VARIANCE DESIGNS IN PSYCHOLOGICAL RESEARCH


LEONARD S. KOGAN 
Institute of Welfare Research, Community Service Society of New York 


About a decade ago Garrett and 
Zubin (49), surveying applications 
and the potential utility of analysis 
of variance in psychological research 
design, pointed out that such tech- 
niques had not yet been widely em- 
ployed. Since that time the number 
of psychological studies using analy- 
sis of variance has become so large 
that even a listing of titles would be 
of prohibitive length. Several statis- 
tical texts emphasizing variance anal- 
ysis in psychological research have 
since appeared (37, 76, 101) as well as 
numerous methodological articles 
written by psychologists.1

The purpose of the present review 
is to indicate the directions and ex- 
tent to which analysis of variance de- 
signs have been applied in recent psy- 
chological research. For the most 
part references are drawn from papers 
appearing in the Journal of Experi- 
mental Psychology and the Journal of 
Comparative and Physiological Psy- 
chology during recent years. No at- 
tempt has been made, however, to 
make an exhaustive survey of such 
applications, special emphasis being 
placed on papers where problem for- 
mulation, design, analysis, and infer- 
ences are presented in sufficient de- 
tail for the reader to grasp essential 
methodology and thus implement his 
understanding of experimental design 


1 Edwards (37) presents a bibliography of
many of these articles.


and analysis over that obtainable 
from the typical artificialities of sta- 
tistical texts. It will be assumed that 
the reader is acquainted with the 
basic concepts and computational 
procedures for analysis of variance to 
the level of Edwards (37) and Mc- 
Nemar (101). Reference will be made 
to other readily available sources 
when necessary. 

Problems of terminology present 
difficulties in any discussion of experi- 
mental designs. Psychologists have 
not been consistent in taking over 
Fisherian terminology (44, 45). 
While terms such as “factorial de-
sign,” “latin square,” “treatment,”
“replication,” and others have gained
widespread usage, terms such as
“block,” “plot,” “varieties,” etc.,
have apparently seemed too agro- 
nomic to be commonly used by psy- 
chologists. In the material to follow, 
popular terminology such as that 
used by Edwards (37) will be fol- 
lowed, with some attention being 
paid to alternative names which have 
been used. 




SINGLE-CLASSIFICATION DESIGNS 


Most statistical texts introduce the 
topic of analysis of variance by de-
scribing the partitioning of sums of 
squares (SS) and degrees of freedom 
(df) in the case of the single-classifica- 
tion design. The essence of this de- 
sign is the presence of a single cri-
terion of classification usually repre-
sented by several independent groups 
of Ss upon whom the same measure- 
ments have been taken (49). The 
several groups typically involve the 
application of different experimental 
treatments. The usual problem to be 
answered by the analysis is whether 
the means of the several groups differ 
more among themselves than can be 
attributed to random-sampling varia- 
tion from a common population. The 
over-all test of the significance of dif- 
ferences among the means is provided 
by an F ratio with the numerator de- 
rived from the variation of the several
means and the denominator based on
“pooling” the individual differences
within the several groups. Other 
common names for this design are 
single-factor design, one-factor de- 
sign, one-way classification, two-part 
analysis of variance, single-variable 
design, simple analysis of variance, 
between and within analysis, and 
simple classification of variates. 
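
The partition described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `one_way_anova` and the scores are hypothetical and are not taken from any study cited here.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, F) for independent groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between groups: variation of the group means about the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: individual differences "pooled" over the several groups
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

# Hypothetical scores for three treatment groups (illustration only)
scores = [[4, 5, 6, 5], [7, 8, 6, 7], [9, 8, 10, 9]]
ss_b, ss_w, df_b, df_w, F = one_way_anova(scores)
print(ss_b, ss_w, df_b, df_w, F)
```

The numerator of F comes from the spread of the group means and the denominator from the pooled within-group variation, as in the verbal description.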

The most frequently used form of 
this design is the two-group case 
where the number of observations in 
each of the groups may be either 
equal or unequal. For this case the 
traditional method of analysis is the 
t test with k (= N₁ + N₂ − 2) df or the
equivalent F ratio with one df in the 
numerator and k df in the denomina- 
tor. Because of its widespread famili- 
arity, no illustrations of the two- 
group case will be presented. 
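
The equivalence of the two tests in the two-group case can be checked numerically: the square of t equals the F ratio with one df in the numerator. The following is a minimal sketch with hypothetical scores and helper names; unequal group sizes are used deliberately.

```python
from math import sqrt
from statistics import mean

def pooled_t(a, b):
    """t for two independent groups, with k = N1 + N2 - 2 df."""
    ss_a = sum((x - mean(a)) ** 2 for x in a)
    ss_b = sum((x - mean(b)) ** 2 for x in b)
    k = len(a) + len(b) - 2
    pooled_var = (ss_a + ss_b) / k
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return t, k

def two_group_f(a, b):
    """F ratio with 1 df in the numerator and k df in the denominator."""
    grand = mean(a + b)
    ss_between = len(a) * (mean(a) - grand) ** 2 + len(b) * (mean(b) - grand) ** 2
    ss_within = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    k = len(a) + len(b) - 2
    return ss_between / (ss_within / k)

# Hypothetical scores (illustration only)
group_1 = [3, 5, 4, 6, 2]
group_2 = [7, 6, 8, 5, 9, 7]
t, k = pooled_t(group_1, group_2)
F = two_group_f(group_1, group_2)
print(t ** 2, F, k)
```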

The extension of the single-classifi- 
cation design to more than two
groups, despite its simplicity, is not
frequently found in the literature.
Franklin and Brozek (47), investigat-
ing the relationship between psycho-
motor performance and type of prac- 
tice schedule, made use of a single- 
classification analysis. Thirty-six Ss 
were allocated to six equal groups 
with comparable means and stand- 
ard deviations on the basis of per- 


formance in “try-out” trials. The
groups were then assigned different 
practice schedules on two psycho- 
motor tests, e.g., three trials a day, 
three trials a week, etc. The single- 
classification design was applied in 
testing the over-all significance of dif- 
ferences among the six group means 
at specified trials. The analysis of 
variance, say, at the ninth trial ap- 
peared in the form shown in Table 1. 


TABLE 1

ANALYSIS OF VARIANCE AT SPECIFIED TRIAL (47)

Source of Variation
Between groups
Within groups
Total

A slight complication appears in 
the single-classification design when 
the number of Ss in each of the sev- 
eral groups is unequal. The computa- 
tional method of correcting for un- 
equal N’s by dividing the total 
squared for each group by its own N 
is readily found in all texts. Ammons 
(2) used this so-called unbalanced 
single-classification design in a study 
of rotor pursuit performance where 
eight unequal groups were given dif- 
ferent conditions of pre-practice 
warming-up activity. As in the pre- 
ceding illustration, Ammons used the 
design to test the over-all significance 
of differences among group means at 
specified trials. A more general exam- 
ple of the single-classification design 
with unequal N’s in the groups is pro- 
vided by Kelman (82) in a study in- 
volving the comparison of suggesti- 
bility scores for four groups of Ss 
classified as Control, Success, Failure, 
and Ambiguous. 
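
The correction for unequal N's mentioned above amounts to dividing each squared group total by that group's own N before subtracting the usual correction term. A small sketch, with hypothetical data and function names, shows that this agrees with the deviation form of the between-groups SS.

```python
def ss_between_from_totals(groups):
    """Computational form: sum of (group total squared / own N) minus correction term."""
    totals = [sum(g) for g in groups]
    sizes = [len(g) for g in groups]
    correction = sum(totals) ** 2 / sum(sizes)
    return sum(t ** 2 / n for t, n in zip(totals, sizes)) - correction

def ss_between_from_means(groups):
    """Deviation form: each group mean about the grand mean, weighted by its own N."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    return sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Hypothetical groups of unequal size (illustration only)
unequal_groups = [[2, 3, 4], [5, 6, 7, 8, 9], [4, 4]]
ss_totals = ss_between_from_totals(unequal_groups)
ss_means = ss_between_from_means(unequal_groups)
print(ss_totals, ss_means)
```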

Comment. The single-classification 
design is the prototype of the classical 
experimental dictum of keeping all
factors constant but the one being in- 
vestigated. Reliable inference from 
this design demands that all condi- 
tions other than those which distin- 
guish the several experimental groups 
be kept comparable from group to 
group or at least completely random- 
ized among the groups. All variation 
over and above the differences among 
means is used to make the estimate of 
chance fluctuation or experimental 
error. Whenever possible, the Ss
should be assigned to the several 
groups in a random manner. Large 
individual differences or heterogen- 
eity of response among Ss within the 
same group enter into the estimate of 
error and may mask small but real 
differences among the groups. Fail- 
ure to reject the null hypothesis, i.e., 
equality of the several means, is thus 
often attributable to small size of 
samples. If, on the other hand, the Ss 
of the experiment are kept markedly 
homogeneous by having them all of 
the same age, sex, IQ, education, 
etc., significant differences among 
groups may be found as a function of 
experimental variations, but the ex- 
perimenter will then find it difficult 
to generalize from his findings to a 
meaningful population.

It is interesting to note that Frank- 
lin and Brozek (47) in the study cited 
above did not actually rely on simple 
randomization in selecting their six 
groups of Ss. Near equality of initial 
means and standard deviations was 
“forced” by distributing the Ss 
among the groups, not by exact pair- 
ing, but by a rough matching of high, 
moderate, and low scores from group 
to group. This attempted control of 
subsequent variation was not, how- 
ever, taken into account in the analy- 
ses of results. It seems probable that 
the size of the “error variance”
might have been reduced (but with 
the loss of 2 df) if the analysis had 


been carried out for a double classifi- 
cation of data (see below), i.e., by 
adding another classification on the 
basis of initial score category. It 
should be emphasized that the writer 
is not questioning the conclusions of 
these investigators but merely using 
their study to illustrate the point that 
statistical analysis should in general 
conform to experimental design if 
maximum accuracy is to be attained. 

The single-classification design is 
somewhat limited in efficiency be- 
cause of the characteristic heterogen- 
eity of human and animal material 
used in psychological research. Al- 
though this design furnishes the 
maximum number of error df for the 
given number of observations, the 
error variance is likely to be rela- 
tively large unless the several classes 
contain a fairly large number of ob- 
servations. Frequently, a marked re- 
duction in error variance can be 
gained by a slight modification of de- 
sign. Perhaps the main usefulness of 
this design is to serve as an extension 
of the t test to more than two groups.
Not only does the analysis of variance
evade the practical problem of carry-
ing out a laborious number of t tests
when there are many experimental 
comparisons to be made, but it can be 
argued that the over-all F test leads 
to more dependable inference about 
possible differences among means. 
The basis for this argument is the in- 
creased reliability or precision of the 
over-all “error” term as a function of
the fact that it is based on more df
than the error based on any two sub-
groups. Moreover, such t tests are
not independent and “significant”
t’s tend to be found more frequently
than indicated by the chosen level of
confidence. Thus even when all sam-
ples have actually been chosen at
random from the same population,
separate t tests often indicate appar-
ent significance of differences. With
six samples, for example, Cochran
and Cox (27, p. 18) state that the ob-
served t between the highest and low-
est mean will exceed the tabled .05 
level about 40 per cent of the time. 
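
The inflation described above can be illustrated by simulation. The sketch below draws six samples from one normal population, computes t between the groups with the highest and lowest means, and counts how often it exceeds a tabled .05 value. The group size of 30, the number of repetitions, the seed, and the approximate critical value 2.002 (two-tailed, 58 df) are assumptions made for the illustration; the exact proportion obtained will vary with these choices.

```python
import random
from math import sqrt

def _mean(xs):
    return sum(xs) / len(xs)

def _var(xs):
    m = _mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def extreme_pair_t(groups):
    """t between the groups with the highest and lowest observed means (equal N)."""
    hi = max(groups, key=_mean)
    lo = min(groups, key=_mean)
    n = len(hi)
    pooled_var = (_var(hi) + _var(lo)) / 2
    return (_mean(hi) - _mean(lo)) / sqrt(2 * pooled_var / n)

def false_alarm_rate(n_groups=6, n_per_group=30, reps=2000, t_crit=2.002, seed=1):
    """Proportion of repetitions in which the extreme-pair t exceeds t_crit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # All groups come from the same population, so the null hypothesis is true
        groups = [[rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
                  for _ in range(n_groups)]
        if extreme_pair_t(groups) > t_crit:
            hits += 1
    return hits / reps

rate = false_alarm_rate()
print(f"proportion of 'significant' extreme-pair t tests: {rate:.2f}")
```

The proportion printed should lie far above the nominal .05, which is the point of the argument for an over-all F test.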


MULTIPLE-CLASSIFICATION AND 
FACTORIAL DESIGNS 


In the single-classification design it
is possible to increase the “sensitiv-
ity” of the experiment, i.e., allow the
detection of smaller differences 
among the experimental groups, by 
using a greater number of cases or by 
improving the reliability of measur- 
ing the dependent variable under con- 
sideration. A third technique for in- 
creasing the sensitivity of an experi- 
ment is by deliberately arranging the 
design so that known sources of vari- 
ability can be controlled and sepa- 
rated both from the experimental 
comparisons and from the estimate of 
experimental error. One of the major 
purposes of multiple classification in 
modern experimental design is to pro- 
vide methods for minimizing experi- 
mental error by the control and isola- 
tion of extraneous sources of varia- 
tion. Perhaps the simplest example 
of such a controlled arrangement is 
the method of pairing cases. The re- 
duction of the standard error of dif- 
ference between the means of paired 
samples, when the pairing results in 
significant positive correlation be- 
tween the samples, illustrates the 
basic procedure of increasing the 
sensitivity of an experiment by mul- 
tiple classification. Here the use of 
the correlation term in the standard 
error formula or the equivalent 
method of analyzing the distribution 
of differences between paired scores is 
exactly the same as breaking down 
the total variation of scores into the 
three mean squares: between treat- 
ments, between pairs, and residual. 
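
The stated equivalence can be verified numerically: the square of the paired t equals the F ratio of the between-treatments mean square to the residual mean square in the three-part breakdown. The scores and function names below are hypothetical and serve only as a sketch.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """t from the distribution of differences between paired scores."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / sqrt(len(d)))

def treatments_f(x, y):
    """F for treatments after removing the between-pairs variation."""
    n = len(x)
    grand = mean(x + y)
    ss_total = sum((v - grand) ** 2 for v in x + y)
    ss_treat = n * ((mean(x) - grand) ** 2 + (mean(y) - grand) ** 2)
    ss_pairs = 2 * sum(((a + b) / 2 - grand) ** 2 for a, b in zip(x, y))
    ss_resid = ss_total - ss_treat - ss_pairs
    # 1 df for treatments, n - 1 df for the residual
    return (ss_treat / 1) / (ss_resid / (n - 1))

# Hypothetical paired scores (illustration only)
treatment_a = [12, 15, 11, 18, 14, 16]
treatment_b = [10, 14, 12, 15, 11, 13]
t = paired_t(treatment_a, treatment_b)
F = treatments_f(treatment_a, treatment_b)
print(t ** 2, F)
```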

The general principle involved in 
the pairing of cases is to increase the 


homogeneity of experimental ma- 
terial by employing the arrangement
known in experimental agriculture as 
randomized blocks (45). This design, 
as the name implies, consisted origi- 
nally of the marking out of blocks of 
land with each experimental treat- 
ment then being randomly assigned 
to plots within each block. Each 
block is often referred to as a repli- 
cate. The resulting yields can then 
be entered in a two-way table with 
rows representing the treatments and 
columns representing the blocks. 
Analysis of variance separates three 
sources of variation: treatments, 
blocks, and error. The psychological 
analogue to the randomized block is 
seen to be either the single S who re- 
ceives all experimental treatments in 
randomized order or a group of com- 
parable Ss, each of whom is randomly 
assigned to one of the experimental 
variations. In animal experiments 
the block may consist of litter mates, 
thus allowing the control of variation 
due to strain, age, weight, etc., while 
in experiments with humans it is 
common to form the block on the 
basis of sex, IQ, socioeconomic level, 
initial scores on the dependent vari- 
able, etc. Many examples of the use 
of randomized blocks in multiple- 
classification design will be presented 
below. In all cases a priori informa- 
tion is used in an attempt to increase 
the precision of experimental com- 
parisons by removing extraneous 
sources of variation. 

Another basis for multiple classifi- 
cation in experimental design is rep- 
resented in the so-called factorial de- 
sign. In this case the investigator is 
interested in studying the effects of a 
number of different experimental 
factors, each of which is varied in two 
or more ways. The experimental 
treatments of the factorial design in- 
volve all possible combinations of the 
factors under consideration. In dis- 
tinction to the classical rule of hold-
ing all but one factor constant, the 
factorial experiment depends on the 
simultaneous variation of as many 
factors or conditions as the experi- 
menter chooses to control. Not only 
is it usually difficult to keep other 
relevant conditions constant as de- 
manded by the classical single-factor 
design, but even if such control were 
attained the basis of generalization 
would accordingly be limited to the 
particular pattern of constancies 
maintained in the given study. 
Fisher (45) stresses the greater effi- 
ciency and comprehensiveness of the 
factorial study. Efficiency is derived 
from the fact that several factors 
may be evaluated with the same pre- 
cision and by fewer observations than 
would be the case in carrying out 
separate studies for each factor. 
Greater comprehensiveness comes
from the possibility of evaluating not
only the over-all effects of each of the
factors but their interactions as well. 
A broader basis of inductive general- 
ization is derived from the considera- 
tion that each factor is evaluated, not
with other factors kept arbitrarily 
constant, but over the range of varia- 
tion of the other factors involved in 
the experiment. Because of these 
unique properties, psychological ex- 
perimentation is becoming increas- 
ingly characterized by the use of fac- 
torial designs, often in combination 
with the principle of randomized 
blocks. 
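
A factorial layout of the kind described above can be enumerated mechanically: the treatments are all possible combinations of the factor levels, and Ss are assigned to them at random. The sketch below assumes a hypothetical 2 × 3 design with four Ss per combination; the factor names, levels, and the function `factorial_assignment` are illustrative only.

```python
import random
from itertools import product

def factorial_assignment(factors, n_per_cell, seed=0):
    """Randomly assign subject numbers to every combination of factor levels."""
    cells = list(product(*factors.values()))
    subjects = list(range(1, len(cells) * n_per_cell + 1))
    random.Random(seed).shuffle(subjects)
    return {
        cell: sorted(subjects[i * n_per_cell:(i + 1) * n_per_cell])
        for i, cell in enumerate(cells)
    }

# Hypothetical factors and levels (illustration only)
factors = {
    "instructions": ["type 1", "type 2"],
    "reinforcement": ["low", "medium", "high"],
}
plan = factorial_assignment(factors, n_per_cell=4)
for cell, ss in plan.items():
    print(cell, ss)
```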

The distinction between multiple- 
classification designs and factorial de- 
signs in psychological research (32, 
34, 49) is sometimes difficult to make. 
Baxter (6) has presented a discussion 
of this distinction. If a given rubric 
of classification can be taken to rep- 
resent variation either on a quantita- 
tive scale, i.e., different amounts, 
degrees, or levels of a variable, or on 
a qualitative continuum, i.e., differ-
ent categories of a set of experimental
conditions or treatments, the particu-
lar classification may be called a
“factor.” On the other hand, if the
axis of classification does not repre- 
sent a quantitative or qualitative 
variable, e.g., subjects, months, 
schools, that particular classification 
would not usually be referred to as a 
factor in strict parlance. It should be 
emphasized, however, that this literal 
conception is not widely adhered to 
and the term “factorial design” is
rather loosely employed, not only by 
psychologists but also by many sta- 
tisticians. In any case the analysis of 
multiple-classification and factorial 
designs generally involves analogous 
procedures. 

The designs falling in the category 
of multiple classification are most 
simply referred to in terms of the 
number of classifications of the data 
or in terms of the number of factors 
involved. Thus one may refer to two- 
way, three-way, etc. classifications or 
two-factor, three-factor, etc. de- 
signs. Other terms which are some- 
times used are complex design, three- 
part, four-part, etc. analysis of vari- 
ance or, simply, higher-order classifi- 
cations. 

There are several major subcases in 
multiple-classification and factorial 
design. The simplest case is that in 
which there is but one replication, 
each subclass containing a single ob- 
servation. In this case the estimate 
of experimental error is provided by 
the highest order interaction term. 
The second case is the design where 
the subclasses of the multiple classifi- 
cation or each unique factorial combi- 
nation contain equal numbers of ob- 
servations. The third case entails 
frequencies in the subclasses which 
are proportionate with the marginal 
totals. And, finally, there is the com- 
plex case where the subclasses con- 
tain unequal and disproportionate 
numbers of observations. Examples
will be provided below for each of 
these variations in fundamental de- 
sign. 

Double classification with one obser- 
vation per subclass. Carpenter (17) 
carried out a study of the effect of 
prolonged visual search, submitting 
his results as evidence that rate of 
blinking can be used as a criterion of 
visual efficiency. Twenty Ss were en- 
gaged in a visual task (Mackworth’s 
Clock Test) where they responded to 
a specified cue by pressing a key 
twelve times during each half-hour.
The measure analyzed was the num- 
ber of eyeblinks during a two-hour 
run. The mean number of blinks per 
minute in each half-hour was calcu-
lated for each S and these means were
treated as single observations. The
analysis appeared as in Table 2.

TABLE 2

MEAN NUMBER OF BLINKS PER MINUTE FOR EACH S
IN EACH HALF-HOUR (17)

Columns: Subject, Half-Hours, Mean (blink rate per minute)

Analysis of Variance
Source of Variation
Between half-hours
Between subjects
Residual (error)
Total

It should be noted in Table 2 that 
the error estimate is actually based 
on the interaction between half-
hours and Ss. This example illus- 
trates the general form of the double- 
classification design where there is 
one observation in each subclass, but 
in this case the Ss cannot be regarded 
as “randomized blocks” since the
columns represent successive periods 
of time and not a random arrange- 
ment of different experimental condi- 
tions. Although such a refinement 
was not apparently necessary in this 
study to demonstrate the “signifi-
cant” increase in blinking rate, in
some cases it may serve to make the
comparison of successive time pe-
riods more sensitive if individual
variations in time regression are
taken out of the “error” term as de-
scribed in the section on “Repeated
Measurements” presented below.
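
The double-classification analysis with one observation per subclass can be sketched as follows. The residual, which is the interaction between Ss and periods, is computed directly and serves as the error term. The table of scores and the function name are hypothetical. With 20 Ss and four half-hours, as in the layout just described, the same formulas would give 3, 19, and 57 df for half-hours, Ss, and residual.

```python
from statistics import mean

def two_way_single_obs(table):
    """table[i][j] is the single score of subject i in period j."""
    n_rows, n_cols = len(table), len(table[0])
    grand = mean(v for row in table for v in row)
    row_means = [mean(row) for row in table]
    col_means = [mean(col) for col in zip(*table)]
    ss_rows = n_cols * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand) ** 2 for m in col_means)
    # Residual = subject-by-period interaction, used here as "error"
    ss_resid = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n_rows) for j in range(n_cols)
    )
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    df_cols, df_resid = n_cols - 1, (n_rows - 1) * (n_cols - 1)
    f_cols = (ss_cols / df_cols) / (ss_resid / df_resid)
    return {"ss_rows": ss_rows, "ss_cols": ss_cols, "ss_resid": ss_resid,
            "ss_total": ss_total, "df_resid": df_resid, "F_cols": f_cols}

# Hypothetical scores: 4 subjects (rows) by 4 successive periods (columns)
scores = [[10, 12, 13, 15],
          [20, 21, 24, 25],
          [15, 18, 18, 21],
          [30, 30, 33, 35]]
result = two_way_single_obs(scores)
print(result)
```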

Similar two-way classifications, one 
axis representing Ss and the other 
based on successive periods of time, 
were used by Siegel and Stuckey 
(122) in a study of the diurnal course 
of water and food intake in rats. The 
use of a double-classification design 
with Ss operating as randomized 
blocks is found in a study by Chap- 
anis, Rouse, and Schachter (18) of 
the effects of various kinds of inter- 
sensory stimulation on form discrimi- 
nation at low brightness. Although 
only three Ss were used in this latter 
study, with each S receiving a ran- 
dom arrangement of six experimental 
conditions, the design illustrates how 
an overwhelming amount of consist- 
ent individual differences may be 
separated from the estimate of ex- 
perimental error by the use of Ss as 
randomized blocks. Another example 
of a double-classification design with 
Ss as one criterion of classification is 
provided by Postman (111) in an
experiment relating the efficiency of 
recognition of nonsense syllables to 
number of correct and incorrect items 
in the recognition tests. Double-clas- 
sification designs with three Ss as one 
axis of classification were also used by 
Mann and Passey (100) in a study of 
adjustment to the postural vertical 
as a function of magnitude of tilt and 
duration of exposure. Although this 
study was factorial in design (8 dura- 
tions of exposure time and 6 varia- 
tions of tilt), the investigators neg- 
lected the opportunity of evaluating 
possible interaction between tilt and 
exposure time by treating the two 
factors in separate double-classifica- 
tion analyses. 

Double classification with equal 
numbers of observations per subclass. 
In this design the double classifica- 
tion is replicated so that there are 
equal numbers of observations within
the subcells. This availability of rep-
lication allows the “interaction”
term used as “error” in the preceding
design to be itself tested against the
residual “within cells” mean square.
In some studies the experimenter 
may be particularly interested in 
possible interaction effects and it is 
impossible to make a judgment about 
the possible significance of such ef- 
fects without some form of replica-
tion. Chapanis and Leyzorek (19)
employed this design in a study on 
accuracy of visual interpolation. 
Eleven Ss were given randomly ar- 
ranged trials where the task was to 
estimate the position of stimuli by 
means of 11 different numerical 
scales. Standard deviation scores 
based on 25 estimates with each of 
two different instruments were com- 
puted for each S for each scale and 
the two resulting scores were treated 
as replications within the subclasses.2
The form of analysis is indicated in
Table 3. Although the investigators
chose to consider the two scores
within each subclass as simple repli- 
cations leading to the analysis pre- 
sented in Table 3, a somewhat more 
informative analysis might have been 
made by treating the experiment as a 
triple-classification design with Ss as 
one axis of classification. Since each 
S was tested “randomly” on the
same two instruments, it would ap- 
pear that the instruments could be 
used as a third axis of classification in 
the form presented below in Table 4. 
If this had been done, the total df 
would have been allocated as follows: 
10 df for Ss, 10 df for scales, 1 df for 
instruments, 100 df for interaction 
between Ss and scales, 10 df for inter- 
action between Ss and instruments, 
10 df for interaction between scales 
and instruments, and 100 df for the 
triple interaction. 
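
The allocation of df just listed follows from the numbers of levels on each axis (11 Ss, 11 scales, 2 instruments) and can be checked with a short sketch. The function name is hypothetical.

```python
def triple_classification_df(n_ss, n_scales, n_instruments):
    """Degrees of freedom for a three-way classification with one entry per cell."""
    s, c, i = n_ss - 1, n_scales - 1, n_instruments - 1
    return {
        "Ss": s,
        "scales": c,
        "instruments": i,
        "Ss x scales": s * c,
        "Ss x instruments": s * i,
        "scales x instruments": c * i,
        "Ss x scales x instruments": s * c * i,
    }

df = triple_classification_df(11, 11, 2)
total_df = sum(df.values())
print(df, total_df)
```

The total of 241 df is one less than the 242 entries (11 × 11 × 2) in the table.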

Two-factor designs with equal num- 
bers of observations in the subcells. 
This design involves the investiga- 
tion of the effects of two factors, each 
of which is varied over a designated 
number of levels. Equal numbers of 
different Ss are randomly assigned to 
each of the several factorial combina- 
tions. Whereas in the preceding 
double-classification design with Ss as 
one of the axes of classification the 
differences between the effects of the 
experimental treatments were associ- 
ated with intrasubject variation, in 
this design differences in treatment 
effects are associated with intersub- 
ject variation. The basic estimate of 
experimental error is derived from 
differences in response of Ss sub- 
jected to the same experimental con-
ditions. The presence of possible in- 
teractions between the two factors 


2 It should be noted that this example con- 
sists of an analysis of variance of a set of sam- 
ple standard deviations. Bartlett (5) recom- 
mends that the analysis be carried out with a 
logarithmic transformation of variances in 
such cases. 
TABLE 3

STANDARD DEVIATIONS OF RELATIVE ERRORS OF ESTIMATION
FOR EACH SUBJECT AND EACH CONDITION (19)

(The entry in each cell is based on 25 estimates.)

Columns: Instrument, Number Scale, Mean*

Analysis of Variance
Source of Variation
Between subject means
Between scale means
Subject-scale interaction
Between instruments within cells
Total

* Mean, Instrument 1 = 5.21; Mean, Instrument 2 = 5.08.


may be evaluated when this design is 
used. Analytic procedure is the same 
as for the double-classification design 
with equal numbers of observations 
in the subcells. Many examples of 
this so-called replicated two-factor 
design were found in the literature. 
Kimble and Bilodeau (85) em-
ployed a 2 × 2 factorial design in a
motor learning study in which initial
and final scores on the Minnesota
Rate of Manipulation Test were
analyzed as a function of two condi-
tions of work and two conditions of
rest, with 24 Ss in each of the four
possible combinations of conditions.
Other examples were a 2 × 3 design
with eight Ss per combination used
by Norris and Grant (108) in a study
of eyelid conditioning as a function of
inhibitory or passive instructions and
three conditions of reinforcement; a
2 × 2 design with six Ss per cell used
by Lawrence and Miller (89) in in-
vestigating resistance to extinction as
a function of two variations in num-
ber of reinforced trials and two
amounts of reinforcement; a 4 × 4 de-
sign with five Ss per cell applied by
Grant and Schneider (60) in a study
of the magnitude of GSR response
during extinction as a function of
four levels of CS intensity during
both reinforcement and extinction; a
4 × 4 design with four Ss per cell used
by Grant and Schneider (59) in
studying the relation of intensity and
frequency of a conditioned eyelid re-
sponse during extinction to four vari-
ations in intensity of CS during rein-
forcement and extinction; a 3 × 3 de-
sign with ten Ss per cell by Chernikoff
and Brogden (20) in a study of the
effects upon sensory conditioning of
three variations in pretraining treat-
ment and three types of instructions;
and a 2 × 2 design with 20 Ss per cell
used by Grant, Norris, and Boissard 
(57) in studying the change in mean 
magnitude of eyelid response from 
pretest to posttest as a function of 
the presence or absence of dark 
adaptation and the presence or ab- 
sence of pseudo-conditioning rein- 
forcement. 
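
The replicated two-factor analysis described in this section, with different Ss in each cell and the within-cells mean square as error, can be sketched as follows. The 2 × 2 layout with three Ss per cell, the scores, and the function name are hypothetical.

```python
from statistics import mean

def two_factor_anova(cells):
    """cells[i][j] lists the scores for level i of factor A and level j of factor B (equal n)."""
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean(x for row in cells for cell in row for x in cell)
    a_means = [mean(x for cell in row for x in cell) for row in cells]
    b_means = [mean(x for row in cells for x in row[j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]
    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    # Error: differences among Ss treated alike, pooled over all cells
    ss_within = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )
    ms_within = ss_within / (a * b * (n - 1))
    return {
        "F_A": (ss_a / (a - 1)) / ms_within,
        "F_B": (ss_b / (b - 1)) / ms_within,
        "F_AB": (ss_ab / ((a - 1) * (b - 1))) / ms_within,
    }

# Hypothetical 2 x 2 design with three Ss per cell (illustration only)
data = [[[3, 4, 5], [6, 7, 8]],
        [[5, 6, 7], [12, 13, 14]]]
f_ratios = two_factor_anova(data)
print(f_ratios)
```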

Double-classification or two-factor
designs with unequal but proportionate
numbers of observations in the sub-
classes. This design differs from the
usual two-way classification de-
scribed above in that the numbers of
observations in the subclasses, al-
though not the same, are proportion-
ate for each row and column to the
numbers of observations in the mar-
ginal totals. For example, a 2 × 2
table with one row having subclasses
containing two and four observations
and the second row containing three
and six observations would fit this
description. The analysis of variance
for this design offers no computa-
tional difficulties since the SS for 
rows, columns, row by column inter- 
action, and within subclasses are ad- 
ditive to the total SS. The only cor- 
rections necessary for the unequal 
entries in the subclasses are the same 
as those used in the analysis of the 
single-classification design with un- 
equal numbers of cases in the several 
groups (124). Webb (130) used this
design in a study of the strength of a
food-reinforced response as a func- 
tion of varying conditions of an irrele- 
vant drive. The irrelevant drive in 
this case was thirst, and one classifi- 
cation of his data was based on the 
setting up of four independent groups 
with different periods of thirst dep- 
rivation. Each of these groups con- 
tained a total of 18 rats. The second 
classification was based on differenti- 
ating the sex of the S. The unequal 
but proportionate subclass frequen- 
cies resulted from the fact that each 
group consisted of 10 males and 8 fe- 
males. 
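
Whether a table of subclass frequencies is proportionate can be checked directly: each cell frequency must equal the product of its row and column totals divided by the grand total. The sketch below applies a hypothetical helper, `is_proportionate`, to the 2 × 2 example given above and to the frequencies described for the Webb study (four groups, each with 10 males and 8 females).

```python
def is_proportionate(counts):
    """True if every cell count equals row total x column total / grand total."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)
    # Cross-multiplied to stay in integer arithmetic
    return all(
        counts[i][j] * n == row_totals[i] * col_totals[j]
        for i in range(len(counts)) for j in range(len(counts[0]))
    )

example_2x2 = [[2, 4], [3, 6]]                 # the illustration in the text
webb_layout = [[10, 8]] * 4                    # four groups, 10 males and 8 females each
disproportionate = [[2, 4], [3, 5]]            # hypothetical counter-example

print(is_proportionate(example_2x2),
      is_proportionate(webb_layout),
      is_proportionate(disproportionate))
```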

In his analysis Webb made the as- 
sumption that the within-subclass 
mean square provided the appropri- 
ate estimate of error variance for 
testing the over-all significance of 
differences among the four major 
experimental groups. This was a 
warranted procedure since in all of 
the measures analyzed (latency, 
extinction) the F ratios of group- 
by-sex-interaction mean square to 
within-subclasses mean square were
“nonsignificant.” In some studies,
however, the experimenter might find
that the “interaction” is “significant,”
and he may then desire to test the in- 
trinsic effect of the main classification 
under the assumption that the ap- 
propriate error term should include a 
compounding of both interaction 
variation and within-subclass varia- 
tion. In the ordinary case where the 
numbers of entries in each subclass 
are equal, this is done simply by 
forming an F ratio of main effect 
mean square to interaction mean 
square. In the present design, how- 
ever, where the numbers of observa- 
tions in the subclasses are unequal 
but proportionate, Smith (123) has 
recently called attention to a qualifi- 
cation in procedure when the investi- 
gator desires to test the significance 
of a main effect over and above the 
variation due to possible interaction.
The appropriate method for carrying 
out this test of significance is some- 
what complex and involves setting 
up an F ratio consisting of multiple 
terms in both numerator and de- 
nominator. 

An example of a two-factor design 
with proportionate subclass N’s is 
also reported by Kelman (82). As in 
Webb’s study, the interaction was 
not found to be significant and no 
complications developed in testing 
the main effects. 

Double-classification or two-factor 
designs with disproportionate numbers 
of observations in the subclasses. The 
situation sometimes arises in experi- 
mentation or investigation where the 
numbers of observations in each of 
the subclasses of a multiple-classifi- 
cation design are not only unequal 
but also disproportionate with the 
marginal totals. In such cases the 
simple corrections for unequal sub-
class frequencies which are applied in
single-classification designs or mul-
tiple-classification designs with pro-
portionate frequencies are no longer
adequate. Such a state of affairs
may arise because of various reasons, 
e.g., failure of Ss to meet appoint- 
ments, loss of animals, type of in- 
vestigation, etc. Such designs are re- 
ferred to as the “nonorthogonal 
case” because the estimates of vari- 
ance computed for the several sources 
of variation are interdependent (124). 
Thus in a 2 × 2 classification if one
were to calculate separately the SS 
for columns, rows, column-by-row in- 
teraction, and residual he would find 
that these SS would not generally 
add to the total SS. 
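
The failure of additivity can be shown with a small numerical sketch. It is assumed here that "calculating separately" means computing the row and column SS from the marginal means and the interaction SS directly from the interaction contrast among the four cell means; the data and function name are hypothetical. With equal cell frequencies the four parts add to the total, while with disproportionate frequencies they do not.

```python
from statistics import mean

def separate_ss(cells):
    """Separately computed SS for a 2 x 2 layout; cells[i][j] is a list of scores."""
    all_x = [x for row in cells for cell in row for x in cell]
    grand = mean(all_x)
    ss_total = sum((x - grand) ** 2 for x in all_x)
    rows = [cells[0][0] + cells[0][1], cells[1][0] + cells[1][1]]
    cols = [cells[0][0] + cells[1][0], cells[0][1] + cells[1][1]]
    ss_rows = sum(len(r) * (mean(r) - grand) ** 2 for r in rows)
    ss_cols = sum(len(c) * (mean(c) - grand) ** 2 for c in cols)
    # Interaction from the contrast among cell means (assumed definition)
    contrast = (mean(cells[0][0]) - mean(cells[0][1])
                - mean(cells[1][0]) + mean(cells[1][1]))
    ss_inter = contrast ** 2 / sum(1 / len(cell) for row in cells for cell in row)
    ss_within = sum((x - mean(cell)) ** 2 for row in cells for cell in row for x in cell)
    return ss_rows, ss_cols, ss_inter, ss_within, ss_total

# Hypothetical data with the same cell means (0, 1, 1, 2) in both layouts
equal_n = [[[-1, 1], [0, 2]], [[0, 2], [1, 3]]]
unequal_n = [[[-1, 1, -1, 1], [1]], [[1], [1, 3, 1, 3]]]

equal_parts = separate_ss(equal_n)
unequal_parts = separate_ss(unequal_n)
print(sum(equal_parts[:4]), equal_parts[4])
print(sum(unequal_parts[:4]), unequal_parts[4])
```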

In the simplest case only one or 
two items of data may be missing 
from some of the cells. The common 
method for estimating a small num- 
ber of missing entries and filling out a 
table was developed by Yates (132)
and is readily accessible in Snedecor
(124), Anderson (3), and Cochran and
Cox (27). The general problem of 
analyzing tables of multiple classifi- 
cation with disproportionate subclass 
numbers is discussed by Lindquist 
(95) and Johnson (76), but in the ab- 
sence of specific cautions, students 
referring to McNemar (101) and Ed- 
wards (37) may incorrectly infer that 
the corrections described for single- 
classification inequality of frequen- 
cies are sufficient. A number of dif- 
ferent solutions to the problems of 
disproportionate frequencies have 
been proposed, all of which involve 
approximations based on varying as- 
sumptions. Snedecor (124) has pre- 
sented a comprehensive summary of 
the so-called methods of fitting con- 
stants, unweighted means, expected 
subclass numbers, and_ weighted 
squares of means. In these methods 
it is generally assumed that the usual 
within-subclasses SS furnish an ap- 
propriate estimate of error variance. 
Tsao (128), on the other hand, has de- 
rived solutions where this assumption 
is not made. Two of the basic de- 
cisions which the investigator must 
always make in selecting a solution 
are whether or not interaction is “‘sig- 
nificant’”’ and whether or not dispro- 
portionality is characteristic of the 
inferred population. 
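
As an illustration of estimating a single missing entry, the sketch below uses the familiar randomized-blocks form of the estimate, x = (rR + cC - G) / ((r - 1)(c - 1)), for a two-way table with one value per cell. The assumptions are that this form is the relevant one, that R and C are the observed totals of the affected row and column, and that G is the observed grand total; the exact presentation in the cited sources may differ. The function name and the data are hypothetical.

```python
def estimate_missing(table, row, col):
    """Estimate one missing entry (None) in an r x c table with one value per cell.

    Uses x = (r*R + c*C - G) / ((r - 1) * (c - 1)), where R and C are the
    totals of the observed values in the affected row and column and G is
    the total of all observed values.
    """
    r, c = len(table), len(table[0])
    row_total = sum(v for v in table[row] if v is not None)
    col_total = sum(line[col] for line in table if line[col] is not None)
    grand_total = sum(v for line in table for v in line if v is not None)
    return (r * row_total + c * col_total - grand_total) / ((r - 1) * (c - 1))


# Hypothetical, perfectly additive table (row effect + column effect),
# with the entry in row 1, column 2 treated as missing (true value 13).
table = [
    [1, 2, 3, 4],
    [11, 12, None, 14],
    [21, 22, 23, 24],
]
estimate = estimate_missing(table, row=1, col=2)
print(estimate)  # 13.0
```

When a table is completed in this way, one df is customarily subtracted from the error term for each estimated value.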

In employing multiple-classification designs with disproportionate
subclass frequencies, some psychologists have taken cognizance of the
special methods necessary for this case while other studies have been
reported in which no apparent adjustments were made. Bray (11) used
corrections suggested by Snedecor (124) in analyzing conformity scores
in an autokinetic situation of 2×2 design, where unequal numbers of Ss
were classified according to racial attitude and whether or not the
confederate was a member of a specified race. Porter, Stone, and Eriksen
(110) also used Snedecor’s methods in analyzing 2×3 and 2×9 designs in
a study where maze error scores were being compared for rats given
electroconvulsive shocks in late infancy and control litter mates. Some
studies in which the analysis apparently failed to take adequate account
of disproportionate subclass frequencies were a 2×2 design by Jenkins
and Postman (75), a 3×3 design by Postman and Jenkins (112), a 2×10
design by Hunt, Schlosberg, Solomon, and Stellar (72), and 2×2 and
3×2×2 designs by Citron, Chein, and Harding (24).

In some experiments the investi- 
gator has sought to evade the prob- 
lem of disproportionate subclass fre- 
quencies by ignoring the individual 
observations in the cells and analyz- 
ing the data as if there were no repli- 
cation, i.e., analyzing subclass means 
as if they were single observations. 
In general this procedure cannot be 
rigorously defended, especially when 
the frequencies are markedly dis- 
similar, since such means are differ- 
entially reliable and nonorthogonal- 
ity is still inherent in the data. 

Triple-classification designs with subjects as a criterion of
classification. In many essentially two-factor designs a third axis of
classification is provided by the fact that each S undergoes all of the
experimental variations or conditions, frequently in random order.
Because “between subjects” is considered a major source of variation in
such designs, there is no “within subclasses” estimate of error and the
basic estimate of experimental error is provided by the triple- or
second-order interaction mean square. Such a design was used by Solomon
(125) in a study of the effect of effort upon distance discrimination,
where ten rats went through four successive experimental sessions,
alternately running a maze with and without a load over a period of
eight days. Analysis of the ordinal number of the side alley first
entered during each session followed the form of Table 4. In this case
the Ss cannot be regarded as “randomized blocks” since each received the
same sequence of experimental variations.

By way of didactic comment about 
Solomon's analysis, the 9 df for rats 
might have been separated into one 
df for sex and 8 df for rats within sex 
groups. Possible sex difference might 
then have been evaluated by means 
of an F ratio derived from these two 
sources of variation. Furthermore, it 
should be noted that days (not ana- 
lyzed) are confounded with sessions. 
Finally, the comparison of effort 
levels is also confounded with days 
since performance under the condition
of “load” as a whole took place 
one day later than performance with- 
out the load. In this case, however, 
the general temporal trend was to 
enter a more remote alley and this 
was an opposing trend to the tend- 
ency exhibited under load. One 
would thus predict that the apparent 
difference between effort levels might 
have been even greater, had the two 
experimental conditions been ran- 
domized for each S. 

Littman (97) used a similar 4×4×11 design with two groups of 11 Ss in a
study of the generalization of a conditioned GSR to tones other than the
original CS. Other applications of this design were made by Black (10)
in a 5×2×25 study of intensity of oral responses to two types of
messages under five levels of intensity, and by Beebe-Center, Black,
Hoffman, and Wade (7) in a 3×12×9 investigation of per diem consumption
as a measure of preference in the rat.

TABLE 4

ORDINAL NUMBER OF SIDE ALLEY FIRST ENTERED BY RATS
DURING THE EIGHT TEST SESSIONS (125)

Trifactorial designs with one observation per subclass. The analysis of
the triple-factor design with one observation per subclass is analogous
to that shown in Table 4, with the replacement of rats, i.e., subjects,
by the third factor. Helson (70) utilized this design in analyzing a
2×4×10 factorial experiment where time errors with handwheels were
classified according to wheel diameter, amount of friction, and speed of
turning. Actually, in this study subcell values were averages for
different groups of six Ss each.

Trifactorial designs with replications. The presence of equal numbers of
observations in the subclasses of a three-factor design affords a
within-cells residual which can be used in testing the significance of
the triple interaction. Wilson (131) used this design in a study of the
frequency of remote associations at recall for rote learning. His
application, using three Ss in each combination of a 4×4×3 design,
yielded an analysis as in Table 5.
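
The allocation of df in a replicated factorial design of this kind follows directly from the numbers of levels. The sketch below is illustrative only: the helper name is hypothetical, and the assignment of four levels each to degree of learning (D) and interval (I) and three levels to spacing (C) is an assumption read from the 4×4×3 description and the stub of Table 5, whose df column is not reproduced here.

```python
from itertools import combinations
from math import prod


def factorial_df(levels, n_per_cell=1):
    """Degrees of freedom for each main effect and interaction of a complete factorial.

    levels maps a factor label to its number of levels; n_per_cell is the
    (equal) number of observations in every subclass.
    """
    df = {}
    names = list(levels)
    for order in range(1, len(names) + 1):
        for combo in combinations(names, order):
            df["x".join(combo)] = prod(levels[f] - 1 for f in combo)
    cells = prod(levels.values())
    df["Within cells"] = cells * (n_per_cell - 1)
    df["Total"] = cells * n_per_cell - 1
    return df


# Assumed levels: D = 4, I = 4, C = 3, with three Ss in each subclass.
table5_df = factorial_df({"D": 4, "I": 4, "C": 3}, n_per_cell=3)
for source, value in table5_df.items():
    print(f"{source:12s} {value:4d}")
```

With `n_per_cell=1` the within-cells line drops to zero df, which is the situation of the preceding subsection, where the highest-order interaction must serve as the estimate of error.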

Gebhard (50) used a similar design in a 2×2×2 study investigating
attractiveness rankings of tasks classified according to experience
(success-failure), expectation of task difficulty, and strength of need.
Other investigators using this design were Grant (55) in a 2×2×2
factorial study of responses to a card sorting task; Grant, Hornseth,
and Hake (65) in a 2×2×2 study of the influence of intertrial interval
on the Humphreys’ effect with verbal responses; and Grant and Mote (63)
in another 2×2×2 study of the effects of brief flashes of light upon
dark adaptation. Lawrence (88) reported a study involving a 2×2×2
design in which certain comparisons were confounded because of the
nature of the experimental design. A study by Conklin (28) which
apparently involved a three-factor design in an investigation of the
effects of temperature, duration of session, and adaptation on skin
resistance presents an allocation of df which is difficult to
reconstruct.

The problem of disproportionate subclass frequencies with tables of
multiple classification again arises in this design. In a 2×2×5 design
used by Bendig and Braun (8) for studying maze behavior, adjustments
were made both for missing cell entries and for differing subgroup N’s
according to suggestions by Snedecor (124), Anderson (3), and Schoenfeld
(121). However, in a 2×4×2 study by Newman and Scheffler (107)
concerned with sex differences in emotional reaction to the news, where
sex, educational level, and type of newspaper were treated as major
sources of variation, there is no evidence that account was taken of the
markedly disproportionate frequencies in the subclasses.


TABLE 5

FREQUENCY OF REMOTE ASSOCIATIONS AT RECALL* (131)

[Table stub: interval between learning and recall (mins.); within each
interval, spacing between trials (secs.): 6, 30, 60; column headings:
degree of learning (% of perfect [illegible]).]

Analysis of Variance

Source of Variation
Degree of learning (D)
Intervals following learning (I)
Conditions of spacing (C)
D×I
D×C
I×C
D×I×C
Within cells
Total

* 3 Ss per cell; data not provided.

Quadruple and higher classification designs. These designs represent
further elaboration of the principles already described. In some studies
all of the classifications can be regarded as factors while in other
studies one of the classifications of the data depends upon the fact
that each S undergoes every variation of experimental combinations.
Quadruple-classification 4×2×2×2 designs were used by Preston, Spiers,
and Trasoff (116) in a level of aspiration study. Grant, Hornseth, and
Hake (61) applied a 5×2×4×40 design, with Ss as one criterion of
classification, in a study of sensitization of the beta-response to
visual stimuli. Littman (98) used a 3×2×2×6 design in a latent learning
experiment, while Horowitz (71) applied several 10×4×4×2 designs with
Ss as one classification in a study of visual acuity. Child and
Grosslight (23) made use of a 3×2×2×2 factorial design in a study of
substitute activity with the added complication of breaking down one of
the factors into a major and minor subclassification. A five-way
7×10×2×4×14 classification was used by Kuntz and Sleight (87) in a
study of legibility of numerals as a function of height/width ratio,
type of numeral, background, and brightness. The highest number of
criteria of classification found in the literature surveyed was applied
by Licklider, Bindra, and Pollack (94) in a study comparing the
intelligibility of normal and “square” speech. Two “talkers” and two
“listeners” furnished two of the major criteria of classification in a
2×5×2×2×3×10 design. The authors present an interesting argument for
the rationale of generalizing from such a small number of Ss.

Comment. This section has dealt 
with the possibilities for increasing 
the precision and scope of experi- 
ments by use of randomized blocks 
and factorial design. In planning an 
experiment involving the comparison 
of the effects of several experimental 
variations, the investigator must al- 
ways decide whether to use the same, 
matched, or different Ss for the vari- 
ous treatments or treatment combi- 
nations. If the same or matched Ss 
undergo all treatments in randomized 
order as in the usual factorial design, 
it is often possible to increase the pre- 
cision of experimental comparisons 
by removing variation associated 
with over-all differences among such 
“blocks.” Assuming that the total 
number of observations is the same, 
such an advantage must be weighed 
against the broader basis for general- 
ization which is derived from the use 
of a larger number of randomly as- 
signed Ss. In experiments where 
naiveté is essential for Ss undergoing 
a given treatment it is obvious that 
the design should contain different 
Ss in each of the subclasses. Simi- 
larly, wide individual variations in 
practice or fatigue effects in the de- 
sign where each S undergoes all ex- 
perimental combinations would tend 
to result in marked interactions be- 
tween Ss and treatments, thus tend- 
ing to obscure differences in the main 
effects of the several factors. If tem- 
poral variation is itself a main sub- 
ject of investigation little would be 
gained from the conclusion that Ss 
show consistent temporal trends 
when each S has undergone several 
experimental treatments in random- 
ized order. The following section on 
“Repeated Measurements” will pre- 
sent some common useful designs 
when temporal trend is a main topic 
of study. 

Factorial designs, often involving a fairly sizable number of factors,
have become very prominent in recent psychological research.* In the
main, such designs have been a boon to experimental methods because they
allow the systematic, economical exploration of the effects of a number
of different factors as well as possible interactions among the factors.
Programs of research, sequentially investigating the effects of varying
one experimental factor at a time, such as characterized the field of
learning in the past, can be immeasurably hastened and increased in
generality by the application of factorial designs. On the other hand,
there seems to be a tendency on the part of some experimenters to
sacrifice considerations of sample size, representativeness of samples,
and both reliability and validity of measurement in their enthusiastic
endeavor to test large numbers of hypotheses by means of factorially
designed experiments. At the extreme, for example, a complex factorial
study providing many df for making many tests of significance might be
carried out for a single S. Multiple observations could be secured for
each subclass by measuring the dependent variable several times for each
treatment combination. The precision of such an experiment might be very
high and conclusions valid for the unique S, but who would attempt to
generalize from the results, whether null hypotheses were rejected or
not? The writer discovered no instance of the use of a single S in
factorial design, but many investigators have reported experiments in
which broad inferences were drawn from less than a half dozen Ss.

* Edwards and Horst (40) have facilitated the computations involved in
higher-order multiple-classification designs by furnishing a method for
the direct calculation of second-order and higher interaction SS.

Although, in principle, there is no limitation on the number of
experimental factors which may be involved, difficulties frequently
arise in the interpretation of complex factorial designs. The number of
treatment combinations increases very rapidly and often limitations in
apparatus or other circumstances cause a large-scale experiment to
stretch out over a considerable period of time. The classic example of a
fairly elaborate factorial experiment in psychological research is that
of Crutchfield (32, 33). In this study the topic of investigation was
string-pulling in rats as a function of five factors, each varied over
three levels. A single animal was assigned to each of the 243 treatment
combinations. Complete analysis would yield a list of 31 mean squares: 5
main effects, 10 two-factor interactions, 10 three-factor interactions,
5 four-factor interactions, and 1 five-factor interaction. In such a
case it is generally assumed that interactions involving three or more
factors can be “pooled” to provide an adequate estimate of experimental
error. Fisher (44) describes this procedure of dispensing with absolute
replication in estimating error as the method of “hidden replication”
and points out the possibilities of loss of precision in tests of
significance when high-order interactions are not really negligible.
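
The counts quoted above for a five-factor, three-level experiment can be verified with a few lines of arithmetic. This is only a bookkeeping sketch: the variable names are hypothetical, and the pooled-error df shown is simply what results if all interactions of three or more factors are combined, not a figure reported in the source.

```python
from math import comb

n_factors, n_levels = 5, 3

# Number of effects of each order, and the df carried by each such effect.
effects = {k: comb(n_factors, k) for k in range(1, n_factors + 1)}
df_each = {k: (n_levels - 1) ** k for k in effects}

total_effects = sum(effects.values())
total_df = sum(effects[k] * df_each[k] for k in effects)
pooled_error_df = sum(effects[k] * df_each[k] for k in effects if k >= 3)

print(effects)          # {1: 5, 2: 10, 3: 10, 4: 5, 5: 1}
print(total_effects)    # 31
print(total_df)         # 242
print(pooled_error_df)  # 192
```

The total of 242 df agrees with 243 treatment combinations less one, as it must when there is a single observation per combination.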

Whether high-order interactions 
can ordinarily be assumed to be un- 
important in psychological research 
is problematic, but it is certain that 
they cannot be evaluated when there 
is no replication. Furthermore, even 
with replication they frequently pre- 
sent puzzling problems of interpreta- 
tion to the experimenter and, since 
large psychological studies are rarely 
repeated, there is little opportunity 
to compare their consistency over a 
series of experiments. 

Psychologists in general have not 
paid much attention to the practical 
and experimental advantages of 
“confounding” in the planning of ex- 
periments. Confounding in this con- 
nection refers to the deliberate ar- 
rangement of the experiment so that 
certain mean squares represent the 
effects of more than one known 
source of variation. Experimenters 
often go to great lengths to avoid the 
possibility of confounding experimental 
factors, sometimes to the considerable 
enlargement of their studies, even when 
previous studies have fairly well 
demonstrated that the factors 
concerned do not interact. Of 
similar character is the practice of 
running all possible combinations in a 
factorial experiment and then com- 
bining high-order interactions to esti- 
mate experimental error. The basic 
principle of deliberate confounding is 
to use “incomplete blocks,” i.e., 
blocks within which all treatment 
combinations do not occur (45). In 
general, the purpose of such con- 
founding is to increase the precision 
of selected experimental comparisons 
while sacrificing the possibility of 
evaluating other comparisons, e.g., 
high-order interactions. 

A simple illustration will clarify the basic idea in deliberate
confounding. Let us suppose that we have a three-factor experiment, each
factor being varied over two levels. Representing factors by letters and
levels by subscripts the eight possible combinations may be separated
into two subgroups: (a) A₁B₁C₁; A₁B₂C₂; A₂B₁C₂; A₂B₂C₁ and (b) A₁B₁C₂;
A₁B₂C₁; A₂B₁C₁; A₂B₂C₂. In the usual complete factorial experiment where
each S serves as a block, every S would undergo all eight experimental
treatments. Let us, however, modify the design so that five Ss undergo
all the combinations listed after (a) while five other Ss undergo those
listed after (b). Each S would then represent an incomplete block. The
resulting analysis of variance would then allot 1 df each to A, B, C,
AB, AC, and BC; 9 df to Ss; and 24 df to the error estimate.
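
The separation into subgroups (a) and (b) can be reproduced by sorting the eight combinations on the sign of the three-factor interaction contrast. The sketch below makes the usual assumption that each contrast is formed as a product of -1 (first level) and +1 (second level) over the factors involved; the function and variable names are hypothetical.

```python
from itertools import product

FACTORS = {"A": 0, "B": 1, "C": 2}


def contrast_sign(combo, effect):
    """Sign of a combination in the contrast for `effect` (e.g., "AB"): -1 per first level."""
    sign = 1
    for letter in effect:
        sign *= 1 if combo[FACTORS[letter]] == 2 else -1
    return sign


# Levels coded 1 and 2; each combination is a tuple (a, b, c).
combos = list(product((1, 2), repeat=3))
block_a = [c for c in combos if contrast_sign(c, "ABC") == -1]
block_b = [c for c in combos if contrast_sign(c, "ABC") == +1]

# Within each block the main-effect and two-factor contrasts balance to zero,
# whereas the ABC contrast is constant, i.e., confounded with blocks.
within_block_sums = {
    effect: [sum(contrast_sign(c, effect) for c in block) for block in (block_a, block_b)]
    for effect in ("A", "B", "C", "AB", "AC", "BC", "ABC")
}

# df bookkeeping for ten Ss, each serving under four combinations.
n_subjects, per_subject = 10, 4
total_df = n_subjects * per_subject - 1
error_df = total_df - 6 - (n_subjects - 1)

print(block_a)            # [(1, 1, 1), (1, 2, 2), (2, 1, 2), (2, 2, 1)]
print(block_b)            # [(1, 1, 2), (1, 2, 1), (2, 1, 1), (2, 2, 2)]
print(within_block_sums)
print(error_df)           # 24
```

Because every contrast other than ABC sums to zero within a block, differences among Ss cancel out of the main effects and two-factor interactions, which is the point of the arrangement.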

The single-factor and two-factor effects are not influenced by
differences among Ss (blocks) while the three-factor interaction ABC is
completely confounded with these differences. In the usual “complete”
experiment this latter interaction would have been estimated from the
difference of (a) and (b) above. In such a case the decision to employ
the confounded design might be based upon the fact that each S is
available for only half of the experimental sessions, a desire to avoid
fatigue or boredom on the part of Ss, or any other reason which might
justify halving the experimental period for each S. The aim in this
particular design is to reduce the error variance used to test the
significance of the main effects and two-factor interactions by
sacrificing the second-order interaction.

The reader should not conclude 
that confounding is possible only 
when individual Ss serve as blocks. 
Confounded designs may at times be 
fruitfully employed when each S 
undergoes only one experimental 
combination. Nor are such designs 
limited to the confounding of high- 
order interactions. Baxter (6) has 
discussed various possibilities for in- 
creased precision in experimental re- 
search through the use of confound- 
ing. In the main, however, the most 
comprehensive presentations of de- 
signs involving deliberate confound- 
ing are found in sources dealing pri- 
marily with experimental agriculture 
(27, 45, 83, 124). The following are 
some hypothetical examples of situa- 
tions where the experimenter might 
consider the possible advantages of 
confounding by means of “incom- 
plete” blocks: 


1. The Ss fall into homogeneous groups, 
e.g., by sex, age, family, IQ, school, vis- 
ual acuity, etc., but there are insufficient 
Ss in each group to allow carrying out all 
treatment combinations. A common ex- 
ample is the limited size of groups of litter 
mates. 

2. Several experimenters might simul- 
taneously handle portions of the entire 
program, thus speeding up completion of 
the experiment. General differences in response 
of Ss due to the experimenters 
could be removed by confounding the 
blocks (experimenters) with unimportant 
interactions. The same principle might 
be applied when different machines are 
used to present experimental stimuli or in 
experiments which involve the use of con- 
federates. 

3. In some experiments time and space 
considerations might determine the sepa- 
ration of blocks. Several different experi- 
mental rooms may be involved or rele- 
vant environmental conditions may vary 
from day to day. Treatment combinations 
belonging to a block could be compared 
with greater precision than would be pos- 
sible if such sources of heterogeneity were 
ignored. 


In all of these cases it is presumed 
that there is significant block-to- 
block variation with respect to the 
dependent variable being studied. 


REPEATED-MEASUREMENTS 
DESIGNS 


In many studies the experimenter 
is especially interested in analyzing 
successive changes in measures ob- 
tained repetitively from one or more 
groups of Ss. Some of the types of in- 
vestigation in which the problem of 
repeated measurements arises are (a) 
examination of learning and extinc- 
tion data; (b) studies of dark adapta- 
tion; (c) investigations of perform- 
ance and fatigue; (d) analysis of se- 
quential measures of physiological or 
sensory-motor functions for varying 
treatment groups. The repeated- 
measurement situation is so common 
in psychological research that Ed- 
wards (37) devotes an entire chapter 
to this topic. Several articles have 
been largely devoted to this type of 
design (1, 86, 96). 

Single group with repeated measurements. If there is but a single group
of Ss, the investigator may be primarily interested in determining
whether the group in general shows a significant trend during the
successive trials or periods. The simplest method of analysis is a
double-classification design where rows represent different Ss and
columns represent successive trials. If the F test for trials mean
square over the Ss × trials interaction mean square is “significant,”
one concludes that there is nonrandom trial-by-trial variation, i.e.,
trial means are not the same (see Table 2). Such an analysis, however,
does not indicate whether the trial means follow a regular linear or
curvilinear trend. In order to “test” for the presence of a consistent
trend, one must fit curves to the data, taking into account both
individual and group regressions upon the time scale. One of the methods
suggested by Alexander (1) was applied by Norris and Grant (108) in a
study of eyelid conditioning to test the statistical significance of
group slope in a design involving nine successive five-trial blocks for
a group of eight Ss. The measure analyzed was mean magnitude of a CR and
the analysis appeared as in Table 6. Alexander (1) points out that in
some cases apparent “significance” of group slope may be attributable to
wide variations in individual slopes.


TABLE 6

MEAN MAGNITUDE (MM) OF CRs AVERAGED FOR SUCCESSIVE FIVE-TRIAL
BLOCKS DURING FIRST DAY REINFORCEMENT TRIALS* (108)

            Successive Five-Trial Blocks
Subject     1-5     6-10    ...    36-40    41-45
1-8
Mean        0.00    0.00    ...

Analysis of Variance

Source of Variation                         df
Group slope                                  1
Between individual means                     7
Between individual slopes                    7
Individual deviations from linearity        56
Total                                       71

* Data in body of table not provided.


Independent groups with repeated measurements. A more complex case with
respect to repeated measurements occurs when several independent
“treatment” or “methods” groups are involved and the investigator wishes
to compare the trends exhibited by the several groups. If the
assumption, among others, is made that individual regressions are
parallel, analysis is readily made in terms of a double-classification
design with the between-Ss variation being subclassified into a
between-treatments source of variation and a between-Ss
within-treatment-groups source of variation (37, 86, 96). This procedure
of decomposing composite classifications or variables “nestled” within
other variables will frequently prove to be valuable in the complete
analysis of many experimental designs (cf. the “split plot” design
described in detail by Cochran and Cox [27]). This form of analysis was
used by Liberman (93) in analyzing 8-trial acquisition and extinction
trends for two groups of 24 rats in a study of transfer effects. The
variance breakdown is shown in Table 7.


TABLE 7

LOG LATENCIES FOR ACQUISITION OF THE RUNNING RESPONSE* (93)

Rats        Acquisition Trials

Group A: Running First, Bar-Pressing Second

Group B: Bar-Pressing First, Running Second

Analysis of Variance

Source of Variation
Trials (T)
Between groups (G)
Between Ss in same group (S)
Interaction: T×G
Interaction: T×S
Total

* Data not provided.

Similar analyses of repeated meas- 
urements for independent groups 
were carried out by Furchtgott (48) 
in a study of maze swimming for 
three groups of rats exposed to differ- 
ent levels of X-irradiation, and by 
Bernberg (9) in comparing the effects 
of shock and narcosis upon maze- 
learning ability in young rats. 
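
The subclassification of between-Ss variation described above for independent groups fixes the df for each line of such a breakdown. The following sketch assumes equal group sizes and uses the source labels of Table 7; the function name and the group sizes in the example are hypothetical and are not those of any cited study.

```python
def repeated_measures_df(n_groups, n_per_group, n_trials):
    """Degrees of freedom for independent groups measured repeatedly over trials."""
    ss_within_groups = n_groups * (n_per_group - 1)
    return {
        "Trials (T)": n_trials - 1,
        "Between groups (G)": n_groups - 1,
        "Between Ss in same group (S)": ss_within_groups,
        "Interaction: T x G": (n_trials - 1) * (n_groups - 1),
        "Interaction: T x S": (n_trials - 1) * ss_within_groups,
        "Total": n_groups * n_per_group * n_trials - 1,
    }


# Hypothetical sizes: two groups of 12 Ss, each S measured on 8 trials.
example = repeated_measures_df(n_groups=2, n_per_group=12, n_trials=8)
for source, value in example.items():
    print(f"{source:30s} {value:4d}")
```

In the customary treatment of this layout, the between-groups mean square is compared with the between-Ss-in-same-group mean square, while trials and the T×G interaction are compared with the T×S interaction.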

The analysis of repeated measure- 
ments for independent groups takes 
on a more complex form if possible 
individual and group variations in 
linear regression are taken into ac- 
count. Alexander (1) provides an 
analytic procedure for such a trend 
analysis. His suggestions were applied 
by Grant, Riopelle, and Hake (64) in 
comparing extinction trends for three 
groups of 15 Ss, each group being 
given a different reinforcement pat- 
tern for an eyelid CR. A trend analy- 
sis of CR magnitude scores was car- 
ried out for five successive blocks of 
trials with the analysis taking the 
form shown in Table 8. 

Similar analyses were found in 
other studies by Grant, Hake, and 
Schneider (58) and Grant and Nor- 
ris (56). 

TABLE 8

MEAN MAGNITUDE OF CONDITIONED EYELID RESPONSES FOR SUCCESSIVE
FIVE-TRIAL BLOCKS DURING EXTINCTION (64)

            Successive Five-Trial Blocks after First 5 Trials
Subject     2     3     4     5

Single Alternation Group

Double Alternation Group

100% Reinforcement Group

Analysis of Variance

Source of Variation
Over-all slope
Over-all deviations from linearity
Between group means
Between group slopes
Group deviations from estimate
Between individual means
Between individual slopes
Individual deviations from estimate
Total


Repeated measurements in multiple-classification designs. Many
variations of the repeated-measurements design may occur. The
within-subclasses mean square of a factorially designed experiment may
be based on successive measurements of the same Ss, or several Ss within
the same subclass may be repetitively measured. In a sense, all designs
where the same Ss undergo several experimental variations are
applications of the “repeated-measurements” principle, although the term
has generally been used to refer to the case where the effect of the
same treatment is measured successively over a period of time. Whereas
in the usual factorial experiment with Ss as one axis of classification
the Ss undergo the several experimental combinations in randomized
order, it may sometimes be necessary to have all Ss undergo the same
sequence of treatments. In other experiments the Ss may undergo the
treatments in differing orders, but the investigator may wish to
eliminate a general temporal effect, e.g., transfer, fatigue, practice
effects, from his estimate of experimental error. An example of a single
group experiment where all of the Ss underwent the several treatments in
the same order is provided by Bruner, Postman, and Mosteller (13).
Nineteen Ss were given the task of reversing the Schroeder staircase for
successive ten-minute periods under three sets of instructions.
Reversals per successive one-minute interval were analyzed with the
time-sequence regression lines for the individual Ss being taken out of
the total variation of scores as a systematic source of variation. The
analysis of variance appeared as in Table 9.


TABLE 9

REVERSALS PER SUCCESSIVE ONE-MINUTE INTERVAL OF THE SCHROEDER
STAIRCASE UNDER THREE INSTRUCTIONS (13)

Subjects    Instructions    Minutes

“Alternate”    M = 47.4
“Hold”         M = 11.5
“Natural”      M = 21.6

Analysis of Variance

Source of Variation
Subjects (Su)
Set (St)
Interaction (Su×St)
Time-sequence regression
Residual sampling variance
Total


Another complex example, where repeated measurements were analyzed, is
found in a study by Lawrence and Miller (89). Their investigation
involved a 2×2 factorial design in which individual and group linear
regression lines (44) were compared for groups of Ss, doubly classified
according to number of reinforced trials and amount of reward. This
study appears to be fairly unique in that curvilinear regression lines
were also examined.

Comment. Experiments involving the repetitive measurement of a criterion
variable for one or more groups of Ss are characteristic of many areas
of psychological research. Similar experiments are not commonly found in
experimental agriculture and uses of variance analysis for
repeated-measurements designs represent special adaptations made by
psychologists. Perhaps the most uniquely psychological of these
applications is the situation where the investigator is particularly
interested in the comparison of trends, e.g., learning curves, for
several independent groups subjected to different experimental
conditions. Traditional methods for comparing such trends were largely
limited to comparisons of experimental groups either at specified points
of the experiment (see Table 1) or with respect to increment or
decrement over specified periods (1). Frequently, the experimenter
simply compared the over-all means of the several groups, thus
completely neglecting the configurations of successive trial means. The
analytic methods cited in this section have proven of practical utility
for the comparison of group trends, but in general they should be
applied with caution since successive measures taken on the same Ss can
hardly be regarded as either randomly distributed or independent. There
is need for further theoretical work on the topic of repeated
measurements, preferably with the aid of mathematical statisticians.

Despite the fact that repeated 
measurements often appear to exhibit 
nonlinear trends the writer found 
that few investigators go to the 
trouble of fitting curvilinear func- 
tions in carrying out analyses of vari- 
ance. Lindquist (96) and Lewis (92) 
discuss procedures for testing the 
goodness of fit of observed successive 
means to fitted curves in the case of a 
single group, but there is need for an 
expository article on the possibilities 
for comparing curvilinear trends for
independent groups by variance
methods.

Finally, it might be noted that the
type of investigation in which all Ss
undergo several different experimental
treatments in the same order
should in general be avoided. The
experiment by Bruner, Postman, and
Mosteller (13) described in Table 9 is
a case in point. The design is increased
in sensitivity by the separation
of individual variations in regression
on time from the estimate of
error, but this does not overcome the
confounding of differences in “set”
means with possible temporal effects
of fatigue, adaptation, etc. Such confounding
of the main experimental
factor could have been obviated by
randomizing the sequence of experimental
conditions among the Ss.


THE LATIN-SQUARE PRINCIPLE 
OF DESIGN


The fundamental principles of the 
latin-square design are described in 
many texts and in articles by Thom- 
son (127), Grant (52), and Edwards 
(38). The latin square is essentially a 
triple classification, one variable rep- 
resented by rows, the second by 
columns, and the third by treatments 
which occur once in each row and 
once in each column. As used by psy- 
chologists, the latin-square arrange- 
ment has typically been applied as a 
form of repeated-measurements de- 
sign where Ss are exposed to several 
experimental treatments and the in- 
vestigator desires to take account of 
the possibility of systematic temporal 
effects such as transfer, practice, or 
fatigue. 
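
The defining property just described, each treatment occurring once in each row and once in each column, can be checked mechanically. The following is a minimal sketch only; the function names are hypothetical and the cyclic construction is one convenient choice, not a procedure taken from the sources cited. Shuffling the rows and columns of a cyclic square preserves the latin property but does not sample uniformly from all possible squares.

```python
import random

def cyclic_latin_square(treatments):
    """Build a k x k latin square by cyclic shifts of the treatment list."""
    k = len(treatments)
    return [[treatments[(row + col) % k] for col in range(k)] for row in range(k)]

def is_latin_square(square):
    """True if every treatment occurs exactly once in each row and each column."""
    k = len(square)
    symbols = set(square[0])
    if len(symbols) != k:
        return False
    rows_ok = all(len(row) == k and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(k)} == symbols for c in range(k))
    return rows_ok and cols_ok

def randomized_latin_square(treatments, rng):
    """Shuffle the rows and columns of the cyclic square (latin property is kept)."""
    square = cyclic_latin_square(treatments)
    k = len(square)
    row_order = rng.sample(range(k), k)
    col_order = rng.sample(range(k), k)
    return [[square[r][c] for c in col_order] for r in row_order]

# Rows stand for Ss, columns for successive periods, letters for treatments.
square = randomized_latin_square(["A", "B", "C", "D", "E"], random.Random(0))
for subject, row in enumerate(square, start=1):
    print(f"S{subject}: " + " ".join(row))
```

Each printed row gives the order in which one S would meet the five treatments.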

TABLE 10

NUMBER OF CORRECT RESPONSES MADE BY FIVE SUBJECTS IN READING THE
LUCKIESH-MOSS LOW CONTRAST TEST CHART UNDER
VARIOUS EXPERIMENTAL CONDITIONS (18)

                      Experimental Conditions*

            Loud     Weak     Heavy      Light
Subjects    Sound    Sound    Pressure   Pressure   Control

A           21(2)    22(3)    20(4)      22(5)      22(1)
B           22(4)    16(1)    23(5)      19(2)      23(3)
C           14(1)    14(5)    23(2)      24(3)      20(4)
D           29(5)    24(4)    24(3)      24(1)      28(2)
E           16(3)    15(2)    14(1)      15(4)      13(5)

Mean        20.4     18.2     20.8       20.8       21.2

                      Experimental Days

            1        2        3          4          5

Mean        18.0     21.2     21.8       20.2       20.2

                      Analysis of Variance

Source of Variation
Subjects
Days
Conditions
Residual (error)
Total

* The entries in parentheses are the days on which the experimental conditions were presented.

Single latin-square designs. In the
most common design using a single
square, the criteria of classification
are Ss (rows), trials or successive periods
(columns), and experimental
treatments (Latin letters). While the
latin square is a very compact form of
design, it should be obvious that its 
application assumes that interactions 
among the variables are negligible. 
If such interactions are present, they 
are confounded with the other sources 
of variation and may serve to aug- 
ment or depreciate the apparent sig- 
nificances of effects. Chapanis, 
Rouse, and Schachter (18) employed 
a single 5×5 latin-square design in
studying the effects of intersensory 
stimulation upon contrast sensitiv- 
ity, as measured by number of correct 
responses on the Luckiesh-Moss Low 
Contrast Test Chart. Five Ss were 
tested under five conditions (loud
sound, weak sound, heavy pressure,
light pressure, control) on each of five 
days. The analysis of results followed 
the form of Table 10. 
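
The partition named in Table 10 can be sketched numerically from the scores and days of presentation given there. This is an illustrative computation, not a reproduction of the original authors' analysis; the assignment of score columns to conditions follows the reading of the table as printed, and the variable names are hypothetical.

```python
# Scores and days of presentation as read from Table 10.
# Rows: Ss A-E. Columns: loud sound, weak sound, heavy pressure,
# light pressure, control (column assignment is an assumption of this sketch).
scores = [
    [21, 22, 20, 22, 22],
    [22, 16, 23, 19, 23],
    [14, 14, 23, 24, 20],
    [29, 24, 24, 24, 28],
    [16, 15, 14, 15, 13],
]
days = [
    [2, 3, 4, 5, 1],
    [4, 1, 5, 2, 3],
    [1, 5, 2, 3, 4],
    [5, 4, 3, 1, 2],
    [3, 2, 1, 4, 5],
]

k = 5
grand_total = sum(sum(row) for row in scores)
correction = grand_total ** 2 / (k * k)

def ss_from_totals(totals):
    """Sum of squares for a classification whose k totals each rest on k scores."""
    return sum(t ** 2 for t in totals) / k - correction

subject_totals = [sum(row) for row in scores]
condition_totals = [sum(scores[r][c] for r in range(k)) for c in range(k)]
day_totals = [0] * k
for r in range(k):
    for c in range(k):
        day_totals[days[r][c] - 1] += scores[r][c]

ss = {
    "Subjects": ss_from_totals(subject_totals),
    "Days": ss_from_totals(day_totals),
    "Conditions": ss_from_totals(condition_totals),
}
ss_total = sum(x ** 2 for row in scores for x in row) - correction
# The residual is what remains after the three main classifications.
ss["Residual (error)"] = ss_total - sum(ss.values())

df = {"Subjects": k - 1, "Days": k - 1, "Conditions": k - 1,
      "Residual (error)": (k - 1) * (k - 2)}
ms = {source: ss[source] / df[source] for source in ss}
f_ratio = {source: ms[source] / ms["Residual (error)"]
           for source in ("Subjects", "Days", "Conditions")}

for source in ss:
    print(f"{source:<18} df={df[source]:>2}  SS={ss[source]:8.2f}  MS={ms[source]:7.2f}")
```

The residual here carries any interactions among Ss, days, and conditions along with error, which is why the design presupposes that such interactions are negligible.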

Leyzorek (91) employed a single 
7×7 latin-square design in a study
analyzing various types of error 
scores made in visual interpolation 
between circular scale markers with 
differing sizes of scale interval. 

Replicated latin-square designs. 
Studies in which latin-square designs 
are used may be replicated by em- 
ploying randomly selected squares of 
the same size or by applying the 
same square to several groups of Ss.
In some cases where the number of
experimental treatments is small it 
may be feasible to apply all permuta- 
tions of order of the several treat- 
ments with several Ss undergoing 
each sequence. The principles in- 
volved in these variations have long 
been used in experimental research 
and special types of such designs have 
been variously referred to as permuted
double-fatigue orders, balanced
orders of presentation, rotation
experiments, crossover designs, 
switchback studies, ABBA orders, 
etc. The relative advantages and dis- 
advantages of various types of repli- 
cation of the latin square and meth- 
ods of analysis are discussed by 
Grant (52), Edwards (37, 38), Coch- 
ran and Cox (27), and Kempthorne 
(83). 

The simplest replicated latin-square
design is the 2×2 square in
which half of a group of Ss go through 
two conditions in one order while the 
remaining Ss go through the condi- 
tions in the reversed order (53). Fre- 
quently one of the conditions serves 
as an experimental control. Brogden 
(12), in a study of sensory condition- 
ing, obtained auditory thresholds 
from 10 Ss first in the presence of a 
light stimulus and then in the ab- 
sence of light, while a second group of 
10 Ss was measured in the reversed 
sequence. Threshold measures were 
analyzed as in Table 11. 

Similar designs were applied by 
Chernikoff and Brogden (21) and 
Chernikoff, Gregg, and Brogden (22) 
in studies of reaction time. In another
study employing a 2×2 design
of the same kind for a comparison of
recall and recognition, the investigators
inappropriately interpret the design
as a 2×2 factorial (113).

TABLE 11

AUDITORY THRESHOLDS WITH AND
WITHOUT LIGHT (12)

                    Experimental Group

                 Threshold     Threshold
Subject*         with Light    without Light

 1                 18.0          20.5
 3                 20.5          18.0
17                 -4.5
19                 15.0
Subgroup Mean      16.5

 2                 20.5
 4                 25.5
18
20
Subgroup Mean

Group Mean         16.0

                    Analysis of Variance

Source of Variation                               df
Treatment                                          1
Ordinal position of treatment                      1
Sequence of treatment                              1
Individual variation of Ss within sequences       18
Error                                             18
Total                                             39

* The odd-numbered Ss make up the subgroup for which the threshold with light was
made first and the even-numbered Ss are the subgroup for whom the threshold with light
was second.


TABLE 12

ERROR SCORES IN LINEAR PURSUIT AS A FUNCTION OF
ANGLE OF ARM FROM BODY* (29)

[Body of table: the sequence of angles (180°, 210°, 240°, 270°, 300°, 330°, 360°)
presented to each sequence of subjects at each ordinal position.]

                    Analysis of Variance

Source of Variation
Angle
Sequence of angles
Ordinal position of angles
Individual differences within sequences
Square uniqueness
Remainder
Total

* Data in body of table not provided; entries indicate sequence of angles; each score was the
mean of 20 trials at a given angle.


Replication of the same latin-square
design for larger squares follows
a similar pattern of analysis
with the addition of another source
of variation entitled “square uniqueness”
by Grant (52) or “latin-square
error” by Edwards (38). The latter
author emphasizes application of 
tests of homogeneity of variance before
pooling terms for an over-all error
estimate. A 7×7 replicated latin-square
design employing the same
square for several groups of Ss was
used by Corrigan and Brogden (29)
in studying the effect of bodily angle 
upon precision of linear pursuit 
movements. Twenty-eight Ss were 
randomly allocated to seven groups 
of four Ss each. Each group went 
through the pursuit task at seven 
angles with varying orders of presentation
in a latin-square design. Error
scores, based on combining 20 trials
at each angle, were analyzed as in 
Table 12. 
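
The sources of variation listed in Tables 11 and 12 imply a simple bookkeeping of degrees of freedom when the same k×k square is applied to k sequences with several Ss in each. The sketch below is offered as an illustration under that assumption; the function name is hypothetical, and only the 2×2 figures are taken from Table 11 (the 7×7 figures are computed by the same rule, not quoted from the source).

```python
def replicated_latin_square_df(k, n_per_sequence):
    """Degrees of freedom when one k x k square is run with n Ss in each of k sequences."""
    total = k * k * n_per_sequence - 1
    df = {
        "treatments": k - 1,
        "sequences": k - 1,
        "ordinal positions": k - 1,
        "Ss within sequences": k * (n_per_sequence - 1),
        "square uniqueness": (k - 1) * (k - 2),
    }
    # Whatever is left after the named sources forms the remainder (error) term.
    df["remainder"] = total - sum(df.values())
    df["total"] = total
    return df

# 2 x 2 square with 10 Ss per sequence, as in the design of Table 11.
print(replicated_latin_square_df(2, 10))
# 7 x 7 square with 4 Ss per sequence, as in the design of Table 12.
print(replicated_latin_square_df(7, 4))
```

For the 2×2 case the "square uniqueness" term has no degrees of freedom, which is why it does not appear in Table 11.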

A similar analysis of an experi- 
mental design involving replication 
of the same 24×24 latin square with
two Ss in each sequence was carried
out by Corrigan and Brogden (30).
Two studies by Gregg and Brogden
(66, 67) also involved the application
of replicated 6×6 latin squares.

As indicated above, when the num- 
ber of experimental treatments is 
small it is possible to utilize the 
latin-square principle by providing 
for every possible permutation of
order of presentation. This arrangement
and the method of analysis are
discussed by Grant (52). Ryan, Cot- 
trell, and Bitterman (118) em- 
ployed such a design in a study of 
muscular tension when they assigned 
four Ss to each of the six possible 
orders of three experimental condi- 
tions (glare, noise, control), but their 
analysis of results was limited to 
treating the design as a double classi- 
fication without replication. In an 
extension of the device of using every 
permutation of experimental orders, 
Grant, Jones, and Tallantis (62), 
studying concept formation by means 
of a card-sorting experiment, em- 
ployed what might be called a double 
latin-square design since each group 
of Ss repeated their assigned order of 
three experimental treatments. Un- 
der the conditions set up by the in- 
vestigators, there were four Ss for 
each of the 24 possible permutations 
specified. 
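
The device of using every permutation of order can be sketched for three conditions with four Ss per order, the arrangement reported for the muscular tension study. The subject labels, the random seed, and the function name are hypothetical; the sketch shows only the allocation, not the analysis discussed by Grant (52).

```python
import itertools
import random

def assign_to_all_orders(subjects, conditions, per_order, rng):
    """Randomly allot an equal number of Ss to every permutation of the conditions."""
    orders = list(itertools.permutations(conditions))
    if len(subjects) != per_order * len(orders):
        raise ValueError("number of Ss must equal per_order times number of orders")
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    return {
        order: shuffled[i * per_order:(i + 1) * per_order]
        for i, order in enumerate(orders)
    }

conditions = ["glare", "noise", "control"]
subjects = [f"S{i}" for i in range(1, 25)]
plan = assign_to_all_orders(subjects, conditions, 4, random.Random(1))

for order, members in plan.items():
    print(" -> ".join(order), ":", ", ".join(members))
```

Because all six orders are used equally often, each condition falls in each ordinal position for the same number of Ss, so that practice or fatigue effects are balanced over conditions.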

The other major replicated latin- 
square design involves the applica- 
tion of several different randomly se- 
lected squares. No examples of this 
design other than those described by 
Grant (52) and Edwards (38) were 
found in the literature surveyed. 

Combined latin-square and factorial 
design. The latin-square principle 
may be combined with a factorial ar- 
rangement of treatments in various 
ways (37, 52, 27). One simple method,
for example, is to have a 4×4
square, with rows representing Ss and
columns representing successive
trials, in which the four latin treatments
are A1B1, A1B2, A2B1, and
A2B2. This specific design was used
by Prentice (115) in a study of the relation
of distance to apparent size of
figural after-effects. Four Ss were subjected
to the four different treatment
combinations on four successive days 
in a latin-square design. Each treatment
was the combination of one of
two distances of stimulus and one of
two conditions of “satiation.” Analysis
was carried out for only three
sources of variation, viz., one df for
the two variations of distance, seven
(?) for individuals, and seven for error,
but might have been extended as
shown in Table 13.


TABLE 13

POINTS OF SUBJECTIVE EQUALITY FOR
EACH SUBJECT FOR DIFFERENT DISTANCES
AND CONDITIONS OF SATIATION* (115)

                    Day
Subject     1      2      3      4

A           2S     6NS    2NS    6S
B           2NS    6S     2S     6NS
C           6S     2S     6NS    2NS
D           6NS    2NS    6S     2S

                    Analysis of Variance

Source of Variation
Days
Subjects (sequences)
Satiation vs. no satiation
Distance
Interaction: Satiation × Distance
Remainder
Total

* Data not provided. The numbers in the
body of table refer to distance (2 m. or 6 m.) at
which S made his judgment; the letters S and
NS refer to the conditions of “satiation” and
“no satiation.”

Another variation of the principle 
of combining latin-square and fac- 
torial design was found in a study of 
delayed response performance by 
Meyer, Harlow, and Settlage (104).
In this experiment measures from
sets of four monkeys were arranged
in a 4×4 latin square, with four
learning periods (rows) and four ex- 
perimental conditions (columns). 
Within each of the 16 cells of the 
square there were 16 entries for all
combinations of four types of object
pairs and four lengths of delay. Sep- 
arate analyses were carried out for 
normal, unilateral damaged, and 
frontal damaged Ss. 

A third variation of combining 
latin-square and factorial principles 
in a replicated design was used by 
Cameron and Magaret (16) in a 
study of responses to incomplete 
sentences. In this study a 2×4 design
was employed, each subcell indicating
the combination of two factors.
Replication of sequences by
having seven Ss undergo each order 
of conditions allowed for the differen- 
tiation of sequence variation from 
variation among Ss in the same se- 
quence. 

A still more complex design involv- 
ing latin-square and factorial princi- 
ples in replication was used by Post- 
man and Bruner (114) in a study of 
the relation of set and perceptual behavior.
Their analysis, however, is
difficult to explicate since composite
sources of variation were dubiously 
partitioned and an attempt was made 
to analyze confounded interactions. 


Greco-latin square designs. A further
extension of the latin-square
principle is to add another experi- 
mental treatment to the latin-square 
design in such a way that each new 
treatment appears but once with each 
Latin letter treatment (44, 45, 52). 
No ‘pure’ examples of greco-latin 
square designs were found in the 
literature surveyed. 

Comment. In a recent article 
McNemar (102) has discussed a type 
of application of the latin-square de- 
sign which has been neglected by 
psychologists. As noted above, the 
common use of the latin square has 
been the case where one classification 
of the data consists of experimental 
treatments while the other two 
classifications consist of uncontrollable
sources of variation, e.g., Ss and
trials. McNemar suggests the use of
latin square as an economical form of 
three-factor design, when each factor 
consists of the same number of levels, 
and the mixed design where two of 
the three classifications are experi- 
mental factors. An example of the 
latter case is provided by Garrett and 
Zubin (49) who describe a study of 
color recognition by the dark-adapted 
eye where the three classifications in 
a 4×4 latin square were order of presentation
(rows), levels of illumination
(columns), and color (Latin
letters). If the rows in this study had 
represented, say, four levels of dark 
adaptation, instead of order of pre- 
sentation, this study could have 
served as an example of a three-factor 
latin-square design. 

Of perhaps more importance is 
McNemar’s contention that the latin- 
square design is rarely applicable in 
psychological research because the 
basic assumption of negligible inter- 
actions among the three classification 
variables is generally violated, espe- 
cially in the design where Ss form one 
of the criteria of classification. McNemar
concludes that the use of the
latin square is “defensible only in
those rare instances when one has
sound a priori reasons for believing
that the interactions are zero” (102,
p. 400).

There is no question but that the 
standard mathematical model of the 
latin-square design assumes that interactions
are negligible and that statistical
inference is most dependable
when this assumption holds. But the
writer does not agree with McNemar
when he states that too many “significant”
F’s are obtained when this
assumption is not met because the
residual term, containing both interaction
and the ordinary error, tends
to be smaller than the interaction
properly used in the denominator for
F. In the first place, it is clear that
the single latin square never provides
an estimate of “pure” error of the type
available when replication within the
same subclasses is carried out. The
residual of the latin square is always
an admixture of confounded first-order
and second-order interactions.
This admixture, provided that the
interactions are negligible, furnishes
an unbiased estimate of “pure” error.
When “significant” (but untestable 
within the design) interactions are 
present the residual could possibly be 
reduced in a specific experiment if the 
interaction(s) happened to follow the 
pattern of one or more of the three 
major classifications, but in the long 
run, if squares were always randomly 
selected, it would be expected that 
significant interaction(s) would tend 
to increase the residual, thus inflating 
the estimate of error. In such a case, 
if the experimenter were interested in
testing the significance of a main effect
against an estimate of “pure”
error (regardless of the presence of
interactions), there would be no such
estimate available, and his purpose
could not be met. If, unknowingly, a
main effect were tested against such
an inflated estimate of “pure” error,
the F ratio would tend to be too 
small. 

Let us assume, with McNemar, 
however, that the experimenter is de- 
sirous of testing the significance of 
main effects over and above the 
presence of possibly significant inter- 
actions. This would be analogous to 
testing main effects against interac- 
tion in the two-factor replicated de- 
sign. The mathematical model for 
the two-factor case is simple because 
only one interaction is present. The 
important point, however, about the 
two-factor design is that the observed 
interaction term is assumed to be 
composed of two additive compo- 
nents, interaction variance plus error 
variance, while an observed main effect
is assumed to be composed of
three additive components, main ef- 
fect variance plus interaction vari- 
ance plus error variance. Since the 
residual term in the latin square is 
made up of several confounded interactions,
it is impossible to set up a
simple “components-of-variance”
model (see below) as for the two-factor
design. Nevertheless, for practical
purposes one can assume that the
observed residual of the latin square 
is composed of two components: con- 
founded interactions variance plus er- 
ror variance. Each observed main 
effect would then be assumed to con- 
sist of three components: main effect 
variance plus confounded interac- 
tions variance plus error variance. 
The consequent F ratio of main effect 
mean square over residual mean 
square should then tend to give an 
unbiased test of the significance of 
main effect over and above the pres- 
ence of significant interactions. Mc- 
Nemar's contention that the denomi- 
nator of the F test should properly 
consist of interaction alone, i.e., sepa- 
rated from error variance, has, so far 
as the writer is aware, no precedent 
in analysis of variance methods. In 
any case, however, it is clear that the 
presence of significant interactions 
negates the application of the usual 
latin-square design. No mathemati- 
cal justification is readily available 
for inference in the case where some 
of the interactions are “significant.” 
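
The component structure appealed to in this argument may be set out in symbols. The notation is introduced here purely for illustration and is not taken from McNemar (102) or the other sources cited: a and b are the numbers of levels of the two factors, n is the number of observations per subclass, k is the size of the latin square, and all effects in the two-factor case are treated as random. The latin-square lines express the writer's working assumption "for practical purposes," not a derived result.

```latex
% Two-factor replicated design, components-of-variance form
\begin{align*}
E(MS_{\text{within}}) &= \sigma^2_e \\
E(MS_{AB})            &= \sigma^2_e + n\,\sigma^2_{AB} \\
E(MS_{A})             &= \sigma^2_e + n\,\sigma^2_{AB} + nb\,\sigma^2_{A}
\end{align*}
% Hence F = MS_A / MS_AB has the same expectation in numerator and
% denominator when \sigma^2_A = 0.

% Working assumption for a single k x k latin square, with \sigma^2_I
% standing for the confounded interaction variance in the residual
\begin{align*}
E(MS_{\text{residual}}) &\approx \sigma^2_e + \sigma^2_I \\
E(MS_{\text{main}})     &\approx \sigma^2_e + \sigma^2_I + k\,\sigma^2_{\text{main}}
\end{align*}
```

On this reading the ratio of main-effect mean square to residual mean square parallels the test of a main effect against interaction in the two-factor design, the denominator containing interaction together with error rather than interaction alone.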


DESIGNS INVOLVING ANALYSIS 
OF COVARIANCE 


In some investigations designed for 
analysis of variance it may not be 
feasible to control or classify the data 
on the basis of one or more relevant 
variables which can, however, be 
measured. The addition of covari- 
ance analysis (44) to the experimen- 
tal design allows for adjustments to 
be made in experimental comparisons
on the basis of the regressions of the
variable of primary importance on 
these other relevant variates. Covari- 
ance analysis may be carried out for 
all of the experimental designs so far 
presented. Discussions of covariance 
are readily available both for single- 
classification designs (36, 37, 76, 95, 
101, 124) and for multiple-classifica- 
tion designs (37, 76, 95, 124), as well 
as for the case of one independent or 
control variable (36, 37, 76, 95, 101, 
124) or two independent variables 
(76, 124). Snedecor (124) also pro- 
vides an example of covariance in a 
latin-square design. 
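
The adjustment involved in the simplest case, a single-classification design with one control variable, can be sketched as follows. The data are invented for illustration and the function name is hypothetical; the sketch computes only the pooled within-groups regression coefficient and the adjusted group means, and omits the tests of significance treated in the sources cited.

```python
def adjusted_means(groups):
    """groups maps a label to (x_values, y_values): x is the control variable, y the criterion."""
    all_x = [x for xs, _ in groups.values() for x in xs]
    grand_x = sum(all_x) / len(all_x)

    sxx = 0.0  # pooled within-groups sum of squares of x
    sxy = 0.0  # pooled within-groups sum of products of x and y
    means = {}
    for label, (xs, ys) in groups.items():
        mean_x = sum(xs) / len(xs)
        mean_y = sum(ys) / len(ys)
        means[label] = (mean_x, mean_y)
        sxx += sum((x - mean_x) ** 2 for x in xs)
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx  # pooled within-groups regression of y on x
    adjusted = {
        label: mean_y - slope * (mean_x - grand_x)
        for label, (mean_x, mean_y) in means.items()
    }
    return slope, adjusted

# Hypothetical scores: group 2 has the higher criterion mean but also
# the higher mean on the control variable.
groups = {
    "group 1": ([1, 2, 3], [2, 4, 6]),
    "group 2": ([3, 4, 5], [5, 7, 9]),
}
slope, adjusted = adjusted_means(groups)
print("pooled slope:", slope)
print("adjusted means:", adjusted)
```

In this invented example the ordering of the two groups is reversed once the criterion means are adjusted to a common value of the control variable, which illustrates why the adjustment may alter experimental comparisons.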

Applications of covariance analy- 
sis in experimental design were not 
too common in the literature sur- 
veyed. In general, moreover, when 
covariance was used little descriptive 
detail was provided. Bernberg (9) 
adjusted error scores for three groups 
of rats, learning a maze under three 
different conditions, on the basis of 
differential food intake in a single-classification
design. In a study of
reminiscence, Buxton and Bakan (15) 
adjusted criterion scores based on differences
between rest and no-rest
conditions by “correction” for recall
trial difference scores. Buxton and 
Ross (14) similarly applied covari- 
ance analysis to a two-factor design 
in a study of the relationship between 
reminiscence and type of learning 
technique.⁴ Reynolds (117), in a
study of resistance to extinction, considered
covariance adjustments of
learning scores on the basis of scores
on a previously trained habit, but rejected
the plan because of heterogeneity
of variances and low correlations.
He then adjusted for training
time in an analysis of the extinction
scores. Glixman (51) applied covariance
analysis to a 3×3×2 factorial
design in a study of recall of completed-incompleted
tasks under differing
conditions of stress. Covariance
adjustments were made for
scores based on number of incompleted-recalled
tasks in terms of total
number of incompleted tasks.


4 The main rationale for using covariance in
this study was to “remove” variance due to
using the same Ss under experimental and control
conditions. Grant (52) illustrates how this
same study might have been analyzed by interpreting
the arrangement as a 2×2 greco-latin
square.

The complex problem of handling 
disproportionate subclass frequencies 
in a double-classification 24 fac- 
torial design with covariance analysis 
is exemplified in a study by Fitch, 
Drucker, and Norton (46), who used 
a procedure developed by Tsao (128). 
A rather full explanation of design, 
basic assumptions, and analytic pro- 
cedures is provided by this study. 

Comment. As noted by Fisher (45) 
and others, analysis of covariance in 
experimental design may be used for 
two major purposes: (a) to increase
the precision of experimental comparisons
by statistically controlling
for sources of variation which do not
lend themselves to experimental control,
and (b) to aid in the interpretation
of the results of an experiment.
In the former case the experimenter
should be sure that the “control”
variable is independent of treatment 
effects (45). Ordinarily he is not par- 
ticularly interested in studying the 
relationship between the concomitant 
measures and the primary variable 
being investigated (95). The stand- 
ard example of a supplementary vari- 
able which can frequently be em- 
ployed to improve the precision of an 
experiment is pretest scores on the 
same kind of performance which is to 
be measured during the experiment 
itself. 

More care must be taken in decid- 
ing to utilize analysis of covariance 
for the second purpose. In this case 
supplementary measures are ordinarily
taken during the course of the
experiment and hence variations in 
the concomitant variable may be a 
function of the experimental treat- 
ments. The obvious difficulty in ap- 
plying analysis of covariance here is 
that adjustment of the primary vari- 
able may remove part of the treat- 
ment effect itself. The experimenter 
may wish, however, to find out 
whether there are significant differ- 
ences in treatment effects on the 
primary variable when the secondary 
variable is ‘equalized’ over all 
groups. In such cases it is generally 
profitable to carry out not only an 
analysis of covariance to “eliminate” 
possible effects of the secondary vari- 
able, but also separate analyses of 
variance of both the secondary variable
and the unadjusted primary
measures, as well as careful examination
of regression and correlation coefficients.
Comparison of the several
analyses will tend to clarify the extent
to which experimental effects on
the primary variable act directly or 
indirectly through the mediation of 
the concomitant variable or covari- 
ate. For example, to paraphrase 
Snedecor (124, p. 335), in the study 
by Glixman (51) cited above: Did the 
Ss show differences in number of in- 
completed-recalled tasks under vary- 
ing conditions of stress because of dif- 
ferences in total number of incom- 
pleted tasks, or in spite of them? 
Excellent discussions of the use of 
analysis of covariance to improve 
understanding of experimental struc- 
ture are presented by Edwards (37) 
and others (27, 83, 124). 

In addition to the complexities of 
procedure and interpretation which 
generally arise when analysis of co- 
variance is applied to designs involv- 
ing multiple classification, or when 
there is more than one concomitant 
variable, other difficulties sometimes 
arise in the use of covariance analysis.
At times the regression of primary
variable upon supplementary
variable may be nonlinear, necessi- 
tating the adjustment of primary 
variable on the basis of curvilinear 
regression (80). In other cases the ex- 
perimenter may discover that there 
are problems in the choice of appro- 
priate regression coefficients for esti- 
mation of the main variable because 
of marked heterogeneity of regression 
from subclass to subclass. Jackson 
(73, 74) provides a detailed discussion 
of such problems and others and sug- 
gests possible solutions. 

In the context of analysis of co- 
variance, special mention should be 
made of the Johnson-Neyman tech- 
nique (78, 81). As noted above, one 
of the major uses of covariance meth- 
ods is to adjust experimental com- 
parisons for extraneous causes of 
variation. Frequently, such adjust- 
ment is designed to “equate” experi- 
mental groups when it is not feasible 
to increase precision by pairing cases 
or otherwise matching the several 
groups with respect to the relevant 
measures. The Johnson-Neyman 
technique not only furnishes a test of 
whether a statistically significant dif- 
ference exists between the means of 
the groups being compared but, in 
addition, specifies the range of con- 
trol variables for which a conclusion 
of significant difference may be re- 
garded to hold. Moreover, no special 
difficulties are involved when the 
groups are unequal in number. John- 
son and Fay (81) provide the detailed 
computational and graphical solution 
for a problem in which the social 
studies achievement of 90 pupils who 
excel in the ability to predict the out- 
come of given events is compared 
with the social studies achievement 
of 90 pupils who are poor predictors. 
The null hypothesis rejected on the 
basis of the analysis was that no difference
exists in mean achievement
between superior and inferior predictors
when the effects of chronological
and mental ages are controlled. The
unique surplus information contributed
by the Johnson-Neyman
technique indicated the range of 
mental age and chronological age for 
which the conclusion of significant 
difference was valid. 


GENERAL CONSIDERATIONS 


Models, assumptions, and transformations
in analysis of variance. Before
valid inferences may be drawn
from an analysis of variance, the data 
must reasonably satisfy certain as- 
sumptions made about the underly- 
ing mathematical models used in the 
analysis and subsequent tests of sig- 
nificance. In recent years various sets 
of assumptions have been proposed 
about the elements in the linear 
models whereby analysis of variance 
is used for statistical inference. 
Pointing out that Fisher (44) had
originally introduced the twofold conception,
Eisenhart (41), in 1947,
elaborated on the viewpoint that 
analysis of variance involves one of 
two basic models, each appropriate 
for the solution of a different class of 
problems: Model I to detect or esti- 
mate fixed relations among popula- 
tion means, and Model II to detect or 
estimate components of random vari- 
ation ascribable to the different fac- 
tors being investigated. The former 
is frequently referred to as the stand- 
ard model while the latter is com- 
monly called the components-of-vari- 
ance model (105). 

In brief the major distinction be- 
tween the two models is that Model 
I assumes that treatment and other 
designated effects are additive fixed 
constants, introducing systematic 
variation, while Model II assumes 
that treatment and other effects are 
random variables each having a normal
distribution. Both models assume
that experimental (residual) errors
are independently and normally
distributed with a constant variance. 
The decision as to whether a given 
element in the linear model is best 
represented by a mean, indicating a 
systematic source of variation, or by 
a variance, indicating a random 
source of variation, depends upon the
extent to which the respective variable
was randomly sampled. In
many experiments some of the effects 
are best regarded as fixed, e.g., the 
usual case for experimental treat- 
ments which are rarely randomly 
drawn from a population of possible 
treatments, while other effects may 
be regarded as introducing random 
variation, e.g., effects assignable to Ss 
drawn at random from a specified 
population. When both types of elements
are present, the underlying
model is described as “mixed.”

6 A recent review by Crump (31) indicates
that Eisenhart’s so-called Model I, Model II,
and Mixed Model have been supplemented by
Tukey with Models III, IV, V, and X, all involving
somewhat different assumptions.
Many of the analyses described by Kempthorne
(83) are based upon finite “randomization”
models, involving no assumption about
normality of error distributions. Other models
have been proposed which are also nonparametric,
i.e., make no assumption about the
form of the population distributions (105,
106). In general, the assumptions involved in
all of these models are less restrictive than the
usual set of assumptions, thus broadening the
potential applicability of analysis of variance
techniques.


The majority of published psychological
studies employing variance
designs give little evidence that investigators
pay much attention to the
several assumptions underlying analysis
of variance. For instance, although
tests of homogeneity of subgroup
variance are readily available,
e.g., Bartlett’s test (37, 76, 84, 124),
the L1 test (76, 84), the M test (76,
84), and Box’s test (36), the assumption
that experimental errors have
equal variance appears to be tested
only in a minority of experiments.
The assumptions that errors are un- 
correlated, with constant variance, 
appear on both empirical and theo- 
retical grounds to be somewhat more 
critical than the assumption of nor- 
mality of distribution of the errors 
(25, 27, 83). 
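
As an illustration of how little labor a check on homogeneity of subgroup variance requires, Bartlett's test may be sketched as below. The function name and the data are hypothetical; the statistic is referred to the chi-square distribution with one fewer degrees of freedom than the number of groups, and the sketch assumes that every group has at least two scores and a variance greater than zero.

```python
import math

def bartlett_statistic(groups):
    """Return Bartlett's corrected statistic and its degrees of freedom."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)

    variances = []
    for g in groups:
        mean = sum(g) / len(g)
        variances.append(sum((x - mean) ** 2 for x in g) / (len(g) - 1))

    # Pooled within-groups variance, weighted by degrees of freedom.
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)

    m = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return m / correction, k - 1

equal = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]      # identical variances
unequal = [[1, 2, 3], [0, 5, 10], [7, 8, 9]]   # one group far more variable

stat_equal, df_equal = bartlett_statistic(equal)
stat_unequal, df_unequal = bartlett_statistic(unequal)
print(stat_equal, df_equal)
print(round(stat_unequal, 3), df_unequal)
```

The statistic is zero when the subgroup variances are identical and grows as they diverge; whether a given value is "significant" is judged against a chi-square table.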

Cochran, in a detailed discussion 
of the consequences when the as- 
sumptions for analysis of variance as 
a technique for carrying out tests of 
significance of differences among 
means are not satisfied, states that 
“the principal methods for an improved
analysis are omission of certain
observations, treatments, or
replicates, subdivision of the error
variance, and transformation to another
scale before analysis” (25, p.
37). The method most frequently
resorted to by psychologists who take
cognizance of violation of assumptions
is transformation of the scale.
The rationale and conditions for vari- 
ous kinds of transformations are most 
fully discussed by Bartlett (5). 
Briefer accounts are available in 
other sources (37, 76, 83, 124). Such 
transformations are frequently in- 
tended to stabilize error variance, 
especially in cases where variances 
within the subclasses show a func- 
tional relationship with subclass 
means. 

In the experimental literature, 
Haggard (68, 69) provides a careful 
investigation of the problem of select- 
ing proper measures of GSR data for 
analysis of variance procedures and 
the effects of using inappropriate 
measures. Among the specific trans- 
formations that were utilized in re- 
cent psychological studies were 
square-root transformation of num- 
ber of reversals of perspective (13), 
log transformation of latency scores 
(4, 93), log transformation of hoarding
scores (72), log transformation of
number of contacts in a pursuit-meter
task (30), reciprocal transformation of
latency scores (89), arcsine
transformation of percentages (75, 
94, 109), and transformation of ob- 
tained scores to per cent of prestimu- 
lus values (35). 
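
The transformations listed above can be sketched as small Python functions. This is only an illustration: the function names are hypothetical, base-10 logarithms are assumed, and the arcsine transformation is taken in its usual angular form (the arcsine of the square root of the proportion, in degrees). The studies cited may have used other variants.

```python
import math


def sqrt_transform(count):
    """Square-root transformation, typically applied to counts."""
    return math.sqrt(count)


def log_transform(score):
    """Base-10 log transformation for positive scores such as latencies."""
    return math.log10(score)


def reciprocal_transform(latency):
    """Reciprocal transformation; turns a latency into a speed-like score."""
    return 1.0 / latency


def arcsine_transform(proportion):
    """Angular transformation of a proportion in [0, 1], in degrees."""
    return math.degrees(math.asin(math.sqrt(proportion)))


def percent_of_prestimulus(score, prestimulus):
    """Express an obtained score as a per cent of its prestimulus value."""
    return 100.0 * score / prestimulus
```

Each function would be applied to every observation before the analysis of variance is computed, so that the error variance within subclasses is more nearly independent of the subclass means.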

The F test in analysis of variance. 
In the single-classification design the 
error term of the F ratio is provided 
by the “within-groups” mean square.
Similarly for the double-classification 
design with a single observation in 
each cell, the denominator of the F 
ratio is furnished by the interaction 
or remainder mean square. Com- 
plications arise, however, in selecting 
the proper denominators for the F 
ratios in the case of the double- 
classification design with several ob- 
servations in each subclass. In gen- 
eral, the most widely used procedure 
has depended upon whether or not 
the investigator decides that the
interaction F test is “significant.” If
the interaction F test is not found to
be significant, the investigator may
either use the “within-cells” mean
square as his error estimate in testing
the significance of variation among main
effects or pool the “interaction” and
“within-cells” SS in arriving at an
error estimate. If, however, he finds
that the interaction mean square is
significant, he generally employs that
mean square as the denominator in his
F tests for the main criteria of
classification. Strictly speaking, the
use of the interaction mean square as 
an estimate of error in this case
involves application of a
“components-of-variance” model to the
data, since
it is thereby assumed that the ex- 
pected value of the mean square for 
an apparent main effect is made up of 
a linear combination of error vari- 
ance, interaction variance, and the 
intrinsic main effect variance itself. 
The hypothesis is thus being tested 
that the intrinsic main effect is not 
significant over and above any
variation attributable to both random
error and the effect of interaction.
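
The three candidate denominators for a main-effect F ratio in the double-classification design can be compared in a short sketch. It assumes equal numbers of observations per cell; the data and the function names are made up for illustration and come from no study cited here.

```python
def two_way_anova(cells):
    """Sums of squares, df, and mean squares for an a x b layout.

    cells[i][j] is the list of observations in row level i, column level j;
    every cell is assumed to hold the same number of observations (n > 1).
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    grand = sum(x for row in cells for cell in row for x in cell) / (a * b * n)
    row_means = [sum(x for cell in row for x in cell) / (b * n) for row in cells]
    col_means = [sum(x for row in cells for x in row[j]) / (a * n) for j in range(b)]
    cell_means = [[sum(cell) / n for cell in row] for row in cells]

    ss = {
        "rows": b * n * sum((m - grand) ** 2 for m in row_means),
        "cols": a * n * sum((m - grand) ** 2 for m in col_means),
        "interaction": n * sum(
            (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
            for i in range(a) for j in range(b)
        ),
        "within": sum(
            (x - cell_means[i][j]) ** 2
            for i in range(a) for j in range(b) for x in cells[i][j]
        ),
    }
    df = {"rows": a - 1, "cols": b - 1,
          "interaction": (a - 1) * (b - 1), "within": a * b * (n - 1)}
    ms = {name: ss[name] / df[name] for name in ss}
    return ss, df, ms


def row_effect_f_ratios(ss, df, ms):
    """F ratios for the row effect under the three denominators in the text."""
    pooled_ms = (ss["interaction"] + ss["within"]) / (df["interaction"] + df["within"])
    return {
        "within_cells": ms["rows"] / ms["within"],      # interaction not significant
        "pooled": ms["rows"] / pooled_ms,               # interaction pooled with error
        "interaction": ms["rows"] / ms["interaction"],  # components-of-variance model
    }


# Made-up 2 x 2 layout with two observations per cell.
cells = [[[1, 3], [5, 7]], [[2, 4], [10, 12]]]
ss, df, ms = two_way_anova(cells)
ratios = row_effect_f_ratios(ss, df, ms)
```

With these made-up data the same row mean square yields quite different F ratios depending on the denominator, which is why the choice of model matters.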
With higher order multiple-classi- 
fication or factorial designs, the selec- 
tion of appropriate F tests becomes 
more complex. Again, psychologists 
have generally proceeded “from the 
bottom up” in setting up F ratios 
from an analysis of variance table. 
For example, if a three-factor design 
with several replications per sub- 
class is involved, the first test is gen- 
erally made by setting up the F ratio 
of the highest order interaction mean 
square over the “within-cells” mean
square. If this highest order inter- 
action is found to be “not signifi- 
cant,” it may then be pooled with the 
“within-cells” term to provide the 
denominator for F tests of the next 
highest interaction terms. On the 
other hand, if the highest order inter- 
action is found to be “significant,” 
the F tests of the next highest
interaction mean squares are made with
the highest order interaction mean
square in the denominators. When
the investigator arrives at the main 
variables of classification he generally 
has used as his denominator for F 
ratios the pooled interactions and re- 
sidual, if none of the preceding F's 
has been significant, or the interac- 
tion mean square of largest magni- 
tude which contains the elements as- 
sumed to be contributing to the ap- 
parent variability of a given main 
variable. This ad hoc, somewhat in- 
tuitive procedure for arriving at F 
ratios may frequently be criticized 
from the standpoint of the variance 
components assumed to be operating 
in the specific situation or because 
the elements assumed to be random 
variables in the linear model may 
more logically be assumed to repre- 
sent fixed parameters. If, however, 
a “components-of-variance” model
(41) is justified by the data and
sampling methods used in the study, a
more appropriate method of testing 
relations is available. Among others 
(31, 83, 84, 105), Cochran (26) has 
recently discussed in detail the prob- 
lems arising in testing a null hypothe- 
sis about several means when an ap- 
propriate denominator for an F ratio 
is not immediately provided by the 
“expected mean squares” in the
analysis of variance table. The
procedure suggested involves setting up
what Cochran calls an F′ test where
numerator and denominator of the 
F ratio are linear combinations of 
mean-square terms arranged in such 
a way that the treatment effect to be 
tested is present only in the numera- 
tor, while all remaining assumed 
components of variance are present 
in both the numerator and denomina- 
tor. The respective df’s for the com- 
posite ratio are determined according 
to approximations proposed by Sat- 
terthwaite (119, 120). The formula- 
tion of such ratios is facilitated by 
the provision of expected mean 
squares for many commonly used ex- 
perimental designs by Snedecor (124) 
and Mood (105). 
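
A composite ratio of this kind, together with Satterthwaite's approximate df for a linear combination of mean squares, can be sketched as follows. The function names are hypothetical. The numerical example assumes a three-factor components-of-variance layout in which A is tested by (MS_A + MS_ABC) / (MS_AB + MS_AC), a standard textbook case rather than one taken from the papers cited. The mean squares themselves are made up.

```python
def satterthwaite_df(coeffs, mean_squares, dfs):
    """Approximate df for sum(a_i * MS_i), following Satterthwaite's formula:
    (sum a_i MS_i)^2 / sum((a_i MS_i)^2 / df_i).
    """
    terms = [a * ms for a, ms in zip(coeffs, mean_squares)]
    return sum(terms) ** 2 / sum(t ** 2 / f for t, f in zip(terms, dfs))


def composite_f(num_ms, num_df, den_ms, den_df):
    """Composite F ratio with approximate df for numerator and denominator.

    num_ms and den_ms are lists of mean squares that are simply summed
    (all coefficients equal to 1), as in the assumed three-factor case.
    """
    ratio = sum(num_ms) / sum(den_ms)
    df_num = satterthwaite_df([1] * len(num_ms), num_ms, num_df)
    df_den = satterthwaite_df([1] * len(den_ms), den_ms, den_df)
    return ratio, df_num, df_den


# Made-up mean squares: numerator MS_A and MS_ABC, denominator MS_AB and MS_AC.
ratio, df_num, df_den = composite_f([40.0, 4.0], [2, 12], [6.0, 5.0], [6, 4])
```

The approximate df are generally fractional; in practice they are rounded or interpolated when entering the F table.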

Individual tests of significance in the 
analysis of variance. Investigators 
frequently desire to follow an over-all 
analysis of variance with tests of the 
significances of differences between 
individual pairs or groups of treat- 
ment means. When the F associated 
with a given classification is found to 
be significant, the most commonly 
used procedure has been the method 
described by Lindquist (95), in which 
t tests are applied to the selected 
means, using standard errors of dif- 
ferences based on the appropriate er- 
ror variance from the analysis of vari- 
ance. Confidence intervals can be set 
up on the same basis. When the F 
ratio representing a given classifica- 
tion is found to be not significant, the 
investigator generally ceases his
analysis. In recent years this simple
alternative operation has been
criticized and further extensions of
analysis of variance have been suggested.
Dixon and Massey (36) propose a 
“test for extreme mean” applicable 
in the situation where one group of 
Ss is a control group, while the re- 
maining groups are experimental 
groups. Johnson (76), utilizing a sug- 
gestion made by Fisher (45), dis- 
cusses how selected pairs of means 
may be compared by lowering the p 
level for significance in accordance 
with the possible number of compari- 
sons. Snedecor (124) and Cochran 
and Cox (27) warn about the dangers 
of testing differences suggested by 
the data and present methods for 
subdividing the treatment and error 
SS for relevant individual and group 
comparisons. 
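
Two of the procedures just described lend themselves to a brief sketch: the t test on a pair of means using the error variance from the analysis of variance, and the lowering of the p level according to the number of possible comparisons. The adjustment shown simply divides the nominal level by the number of possible pairs. That rule is an assumption made for illustration, since the text does not state the exact rule, and the function names are hypothetical.

```python
import math


def t_for_pair(mean1, mean2, n1, n2, error_ms):
    """t ratio for two treatment means, with the standard error of the
    difference based on the error mean square from the analysis of variance."""
    se_diff = math.sqrt(error_ms * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se_diff


def per_comparison_level(alpha, n_means):
    """Nominal level divided by the number of possible pairs of means
    (assumed adjustment rule, for illustration only)."""
    n_pairs = n_means * (n_means - 1) // 2
    return alpha / n_pairs
```

The obtained t is referred to the t table with the df of the error mean square, at the adjusted level if that safeguard is adopted.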

Subdivision of treatment SS is es- 
pecially applicable in experiments 
where the levels of a given factor rep- 
resent varying amounts or categories 
along a treatment continuum, e.g., 
degrees of learning. Kelman (82), for 
example, had four groups of Ss (Con- 
trol, Success, Failure, Ambiguous) 
which furnished three orthogonal 
comparisons: (a) Control group vs. 
experimental groups (3C−S−F−A); (b)
Ambiguous reinforcement vs. clear-cut
Success or Failure (2A−S−F); and
(c) Success vs. Failure (S−F). Johnson
and Tsao (79), in a 4×7×2×2×2
factorial study dealing with the
determination of differential limen 
values, furnish a detailed discussion 
of the application of orthogonal 
polynomials (44) in expressing the 
relationships between the factors, 
e.g., weight, rate, etc., and the limen 
values. The procedure of fitting 
orthogonal polynomials to a given 
factorial classification with associated 
tests of significance for linear regres- 
sion, parabolic regression, etc. can 
frequently be used to furnish
information and answers to questions
which are not supplied by the over- 
all F test for a given set of treatment 
means (27, 124). 
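
The three comparisons in Kelman's design can be written as coefficient vectors and checked for orthogonality, and the treatment SS can be subdivided accordingly. The sketch below assumes equal group sizes. The group means and the group size are made up, and the function names are hypothetical.

```python
# Coefficients in the order (Control, Success, Failure, Ambiguous).
COMPARISONS = {
    "control_vs_experimental": (3, -1, -1, -1),   # 3C - S - F - A
    "ambiguous_vs_clear_cut": (0, -1, -1, 2),     # 2A - S - F
    "success_vs_failure": (0, 1, -1, 0),          # S - F
}


def are_orthogonal(c1, c2):
    """Two comparisons are orthogonal (equal n) if their products sum to zero."""
    return sum(x * y for x, y in zip(c1, c2)) == 0


def comparison_ss(coeffs, means, n_per_group):
    """Sum of squares (1 df) attributable to a single comparison."""
    value = sum(c * m for c, m in zip(coeffs, means))
    return n_per_group * value ** 2 / sum(c ** 2 for c in coeffs)


def treatment_ss(means, n_per_group):
    """Between-groups sum of squares for equal-sized groups."""
    grand = sum(means) / len(means)
    return n_per_group * sum((m - grand) ** 2 for m in means)


# Made-up group means, five Ss per group.
means = [10.0, 14.0, 9.0, 12.0]
parts = {name: comparison_ss(c, means, 5) for name, c in COMPARISONS.items()}
```

Because the three comparisons are mutually orthogonal, their sums of squares add up to the treatment SS on three df, and each can be tested separately against the error mean square.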

Perhaps the most simple and prac- 
tical procedure for comparing indi- 
vidual means now available to the 
investigator who is not satisfied with 
the results of an over-all F test is 
that presented by Tukey (129). In 
this procedure, after finding a signifi- 
cant F for a set of treatment means, 
one successively applies a “gap” test,
a “straggler” test, and a new F test to
subgroups among the treatment 
means to detect distinguishable 
groups. 

The power of analysis of variance 
tests. Almost universally, the psy- 
chologist has limited his attention in 
testing hypotheses to consideration of 
errors of the first kind (Type I errors), 
i.e., rejecting hypotheses when they 
are true. The risk of committing 
errors of the second kind (Type II er- 
rors), i.e., accepting false hypotheses, 
has in general entered very little into 
his schema of statistical inference. 
The usual test of significance involves 
the specification of a so-called critical 
region which controls only the risk of 
error of the first kind. Thus, for
example, if the critical region is set
at the .05 level for an F test, the
experimenter is in effect declaring
that, when the null hypothesis is true,
an F equal to or exceeding the
tabulated value will lead him to reject
it wrongly only 5 per cent of the time.

What happens, however, when a 
null hypothesis is not rejected, i.e., 
if the F ratio is smaller than, say, the 
tabulated value for the .05 level? 
Most investigators appreciate in the- 
ory that such a finding does not mean 
that the null hypothesis is proved. 
And yet the tendency is strong to ac- 
cept the null hypothesis and draw the 
conclusion that no differences among 
means are present. The power func- 
tion of a given test of significance is 
designed to indicate the probabilities
of rejecting a specified hypothesis 
when alternative hypotheses are as- 
sumed to be true. In the usual case 
where the specified hypothesis is a 
null hypothesis, i.e., all of a group of 
means are equal, the probability of 
rejecting the null hypothesis when 
the true means are in fact different 
depends upon the significance level 
selected for the test, the magnitude 
of the differences among the means, 
the size of the error variance, and the 
number of replicates. 

Two major approaches have been 
devised for determining the power of 
analysis of variance tests. In the 
older method developed by Tang 
(126), the alternative hypothesis to 
the null hypothesis is expressed in 
terms of the variance of a finite set of 
assumed population means equal in 
number to the number of observed 
means involved in the F test. In a
more recent approach described by
Ferris, Grubbs, and Weaver (43), the
alternative hypothesis is expressed in
terms of a set of normally distributed
population means, these means
representing a sample from a normal
superpopulation with variance bearing
a specified ratio to the error
variance involved in the F test. Since,
however, this latter paper presents a 
somewhat limited set of curves for 
estimating the power of the analysis 
of variance test at only the .05 sig- 
nificance level, further discussion will 
be limited to the Tang approach.*
The method developed by Tang 
assumes that the observations can be 
expressed in terms of Model I (see 


* Actually the paper of Ferris, Grubbs, and 
Weaver (43) presents “operating-characteristics”
curves which are related to power curves
as x is to 1 − x, i.e., complementary. Eisenhart
et al. (42) also provide a brief discussion of 
operating-characteristic functions for analysis 
of variance tests based upon Eisenhart's 
Model II.


above), which assumes a linear com- 
bination of mean effects and errors 
which are normally and independ- 
ently distributed with constant vari- 
ance. Tang presents fairly extensive 
tables for varying pairs of df, under 
the assumption that either a .05 or 
.01 level of significance is being em- 
ployed for rejection of the null hy- 
pothesis. In these tables the prob- 
abilities of error of the second kind 
are indicated for varying sizes of φ, a
variance ratio with numerator de- 
rived from the assumed alternative 
hypothesis. Lehmer (90) subse- 
quently prepared tables providing 
the value of φ required for a specified
probability of error of the second 
kind. Tang’s tables are reproduced 
with extensive discussion of their use 
in Kempthorne (83) and Mann (99), 
while Lehmer’s tables for Type II er- 
rors of probability .3 and .2 are avail- 
able in Dixon and Massey (36). 
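
For readers who wish to enter such tables, the quantity φ can be computed from an assumed set of population means. The sketch below uses the form in which φ is usually stated for the single-classification case with equal group sizes: the square root of n Σ(μ_i − μ̄)² / k, divided by the error standard deviation. The text gives no formula, so this definition is an assumption here, and the function name is hypothetical.

```python
import math


def tang_phi(assumed_means, n_per_mean, error_sd):
    """phi for k assumed population means, each based on n_per_mean
    observations, with error standard deviation error_sd (assumed form)."""
    k = len(assumed_means)
    grand = sum(assumed_means) / k
    spread = sum((m - grand) ** 2 for m in assumed_means)
    return math.sqrt(n_per_mean * spread / k) / error_sd


# Two means one error standard deviation apart, 24 observations per mean.
phi = tang_phi([0.0, 1.0], 24, 1.0)
```

The resulting φ is looked up against the df for the numerator (k − 1) and for the error term at the chosen level of significance.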
The power function of analysis of 
variance tests is very useful (a) for 
estimating the sample sizes that will 
reasonably guarantee a desired prob- 
ability of error of the second kind for 
a specified alternative hypothesis and 
a designated level of significance, and 
(b) for determining the power of a
test, if the sample sizes and signifi- 
cance level have already been fixed. 
Perhaps the chief implication of the 
power concept in relation to current 
psychological research is the conclu- 
sion that many experiments are car- 
ried out without sufficient replication 
to insure a reasonable chance of
detecting experimentally important
differences in treatment means.
Kempthorne (83), for example, estimates
that six replicates in each subclass are
necessary in a 2×2×2 factorial
experiment to insure with a probability
of .95 that a true difference of the two 
means for a given factor equal in size 
to the error standard deviation will 
be detected (using a .05 test of
significance).
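
The dependence of power on replication can be illustrated with a rough calculation for the difference between two means. The sketch substitutes a normal approximation for the exact tables, treating the error standard deviation as known. It is therefore somewhat optimistic when the error df are small and is not expected to reproduce tabled values exactly. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def approx_power_two_means(diff, error_sd, n_per_mean, alpha=0.05):
    """Approximate power of a two-sided test of the difference between two
    means, each based on n_per_mean observations (normal approximation)."""
    z = NormalDist()
    se_diff = error_sd * math.sqrt(2.0 / n_per_mean)
    critical = z.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(diff) / se_diff
    return z.cdf(shift - critical) + z.cdf(-shift - critical)


def n_for_power(diff, error_sd, target_power, alpha=0.05, max_n=10000):
    """Smallest number of observations per mean reaching the target power,
    or None if max_n is not enough."""
    for n in range(2, max_n + 1):
        if approx_power_two_means(diff, error_sd, n, alpha) >= target_power:
            return n
    return None
```

In a factorial experiment, n_per_mean is the total number of observations entering each of the two means being compared, not the number of replicates in a single subclass.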

The power conception is somewhat 
contrary to the commonly accepted 
notion that an experimenter should 
insist on a very stringent level of sig- 
nificance before rejecting a null hy- 
pothesis when his samples are rela- 
tively small. Such a notion, para- 
doxically defended on the grounds of 
conservatism, has probably resulted 
in the premature dismissal of many 
potentially important areas of experi- 
mentation. It might be a worth- 
while addendum to current methodol- 
ogy if many exploratory, small- 
sample experiments were primarily 
devoted, not to tests of hypotheses, 
but to obtaining estimates of error 
variance. Once such estimates of er- 
ror variance have been obtained, the 
experimenter is in a position to de- 
termine the sample sizes necessary to 
detect differences between treatments
regarded to be of practical or
theoretical significance.

An overview of psychological research
design and analysis. Because
of the widely accepted thesis that 
experimental design and statistical 
analysis are “dynamic” aspects of the
same research “whole,” and inasmuch
as many statistical and measurement
techniques, e.g., regression
and correlational analysis, the t test,
chi square, discriminant functions, 
etc., can be subsumed under analysis 
of variance and the F distribution, a 
thorough survey of variance designs 
used by psychologists could have 
been extended far beyond the limits 
of the present article. Minimum at- 
tention, moreover, has been given to 
the general theory of experimental 
design in this survey. Basic concepts 
such as experimental control, statisti- 
cal control, randomization, replica- 
tion, balance, efficiency, precision, 
orthogonality, comprehensiveness,
self-containedness, etc. enter in the
adoption of any specific experimental 
design, but, in general, the journal re- 
porting of studies is not amenable to 
elaboration of underlying principles. 
With regard to specific experimen- 
tal arrangement, cursory examination 
of books which have treated experi- 
mental design in agricultural, bio- 
logical, or industrial research, from 
Fisher's classic (45) to the recent 
comprehensive presentations by 
Cochran and Cox (27) and Kemp- 
thorne (83), reveals that psycholo- 
gists have generally utilized only the 
simpler and “complete” experimental
configurations. Of the 150 experi- 
mental plans presented by Cochran 
and Cox (27), only a small minority 
seem to have appeared in psychologi- 
cal research design. As noted previ- 
ously, various devices of deliberate 
confounding or partial confounding, 
especially applicable in higher-order 
factorial studies, seem to be only 
rarely considered, despite their early 
introduction into the psychological 
literature by Baxter (6) and their
potential experimental and practical
advantages. This is not to say that 
methodology from one area of re- 
search can be routinely applied in 
another area, but the increasing fre- 
quency with which variance designs 
have been applied in psychological 
research probably indicates a trend 
which can be expected to continue. 
The practicing researcher, of 
course, finds difficulties in keeping up 
with current developments and re- 
finements in the area of experimental 
design and analysis. A real service is 
being performed by the excellent 
summaries being presented in the 
Annual Review of Psychology (39, 54, 
103). Nevertheless, as pointed out by 
Johnson (77) in a recent discussion of 
the contribution of statistical science 
to educational and psychological
research, the “newer developments in
the field have mainly been specialized
devices for specialized purposes” with
three basic principles of experimen- 
tation—replication, randomization, 
and control of variability—being the 
foundation stones of modern experi- 
mental design. Current psychological 
research, as we have seen, has been 
tremendously influenced by the 
“Fisherian revolution in methods of
experimentation” (133). 


SUMMARY 


This article has presented a survey
of the major types of experimental
design involving analysis of variance 
which have characterized psycho- 
logical research during recent years. 
The survey is implemented by brief 
reference to specific studies utilizing 
a variety of experimental configura- 
tions which have appeared in the 
literature. Some comments were 
made about the appropriateness of 
design or analysis in particular in- 
stances, followed by a discussion of 
general considerations in application 
of variance design and analysis. 

It is the intent of this paper, first,
to present a theory—largely derived 
from Schachtel (77) and Rickers- 
Ovsiankina (74)—and second, to ex- 
amine a number of studies involving 
color in an effort to substantiate and 
clarify the theory. The reader will 
soon observe that the paper is heavily 
weighted with material from Ror- 
schach inkblot studies. This weight- 
ing is explained by two somewhat 
mutually dependent factors: color, 
as well as a number of other variables, 
has been systematically treated in 
the Rorschach test, and there have 
been a multitude of papers written in 
which the Rorschach test was fea- 
tured. Whenever possible, parallel 
studies which use other techniques 
employing color will be introduced. 


DERIVATION OF A THEORY 
Schachtel 


Schachtel feels that the experience 
of color and the experience of affect 
have two important characteristics in
common: “...the passivity of the
subject, and the immediacy of the
relation object-subject” (77, p. 399).


1 This paper represents a slight modification 
of the first chapter of a Ph.D. dissertation, A 
Study of the Relation of the Response to Color 
and Some Personality Functions, submitted in
partial fulfillment of the requirements for the 
Degree of Doctor of Philosophy, Western Re- 
serve University, 1952. The writer gives his 
sincere thanks to Dr. Calvin S. Hall whose 
astute and acute criticisms contributed much 
to the formulation of this paper. 

2 At the time this paper was submitted, the 
writer was a USPHS postdoctoral research 
fellow at the Roscoe B. Jackson Memorial 
Laboratory. At time of publication, the 
writer is a postdoctoral training fellow in
Clinical psychology at the VA Hospital,
Northampton, Massachusetts. 


Two examples from Schachtel may 
clarify this analogy. 

1. An individual enters a room in 
which there are two designs. On one wall 
is a large blob of color. On the opposite 
wall is a large design in black and white. 
The blob of color is immediately per- 
ceived, almost without conscious atten- 
tion. The individual is aware only of 
color. The design in black and white re- 
quires directed attention before it can be 
perceived. 

2. An individual becomes angered. He 
strikes out blindly at his antagonist, with- 
out regard for the consequences of his act. 
He is aware only of his anger and an ob- 
ject upon which to vent this anger. 


The two examples are extremes.
One may imagine more moderate be- 
haviors in the two situations de- 
scribed above. An individual enter- 
ing the room immediately perceives 
the blob of color, but his perception 
also encompasses the contour, and 
some analysis of its shape occurs. An 
individual becomes extremely angry 
but acts upon the situation in such a 
manner that the tension produced by 
the anger is reduced without violence 
being done either to the stimulus of 
the anger or to the individual himself. 

It is a primary task of the ego to 
control and direct affective reactions, 
whether produced by drives originating
from without the ego or from
within it (40). The individuals in the 
two situations described by Schachtel 
were completely passive. They were 
literally swept away by their experi- 
ences. Their egos, which should have 
channeled and controlled the aroused 
experiences, failed in their tasks. The 
behaviors of the individuals in the 
situations described by the writer 
may have retained some elements of 
passivity, but this passivity was
considerably moderated by ego control.
It is evident that passivity does not 
refer to the overt behavior of the indi- 
vidual, but only to the relation be- 
tween his affective drives and his ego. 
Schachtel points out that the affec- 
tive experience is a conscious one, re- 
gardless of the passivity of the ego. 
Where there are no affects, there is no 
consciousness of drives. 
Rickers-Ovsiankina 
Rickers-Ovsiankina, after review- 
ing the literature (74), concludes that 
an individual's response to color can 
give considerable insight into the de- 
gree of permeability of his ego. That 
individual whose ego is responsive to 
the outside world will respond to 
color. As the degree of permeability 
increases—that is, as the boundary 
between the ego and the outside 
world lessens in strength—the indi- 
vidual will respond more to color per 
se.


Both Schachtel and Rickers-Ovsi- 
ankina stress the fact that the extra- 
tensive individual is responding to 
outside stimuli and that there is a 
lessening in the spontaneity of the
individual. However, one misses an
important point if he concentrates 
upon this particular elaboration 
in Rickers-Ovsiankina’s paper.
Schachtel feels that the individual 
who responds to color per se not only 
possesses an ego which readily re- 
sponds to the outside environment 
but which also is less capable of
exerting control upon affective drives
having internal origin. He feels then 
that the permeability of the ego is a 
two-way affair, for there is also a 
more direct release of affective drive 
upon the external environment. 


The Concept of the Egocentric 
Individual 


Let us diverge for a moment to
consider another aspect of the
individual who responds to color per se.
Such a divergence will serve the dual 
purpose of clarifying the points pre- 
sented thus far, and of facilitating 
interpretation of certain of the stud- 
ies which will be presented later. 

The individual who responds to 
color per se has been called egocentric 
by a number of authors (11, 16, 41, 
42, 54, 68). In the light of the pre- 
ceding discussion, just what does 
“egocentricity” mean? Warren (88)
defines the term “egocentric” as
follows: “disposed to dwell on oneself
and to view every situation from a
personal angle” (88, p. 89). As a
synonym, he gives the term
“self-centered.” Do Schachtel and
Rickers-Ovsiankina actually consider the
individual who responds to relatively
undifferentiated color “egocentric”?
Under Schachtel’s scheme the indi- 
vidual adopting this mode of color 
response may behave in one of two 
ways, or both. 

The individual will adapt to his en- 
vironment, behaving as it dictates. 
If an environmental configuration di- 
rects action in one way, he will act in 
that way. If action is directed an- 
other way, then the individual will 
again modify his action to conform 
to environmental pressure. This in- 
dividual clearly cannot be called 
“self-centered,” for he is most likely
to perform in the manner in which 
others wish him to perform. But 
Schachtel further feels that the indi- 
vidual responding to relatively un- 
differentiated color may also release 
his affective drives in a relatively un- 
differentiated manner upon the en- 
vironment without regard for that 
environment. It is this latter be- 
havior which might best be called 
“egocentric.” Nevertheless, Schachtel
makes it clear that an individual
who behaves in this manner is not 
channeling his affective drives. There 
is a more or less direct interchange
between his affective drives and the 
environment. The individual’s be- 
havior, therefore, rather than being 
egocentric, is relatively removed from 
the control of the ego. The ego plays 
a relatively unimportant role as to 
the object upon which the affective 
charge is released and the manner of 
its release. It is certainly far from a 
deliberate (that is, ego in origin) at- 
tempt to ignore the feelings of others 
that results in the sometimes anti- 
social and inconsiderate behavior of 
this individual. It is the result of a 
fundamental incapacity of the ego to 
direct and control the affective charge 
in a realistic manner. 


STATEMENT OF A THEORY 


The experience of affect and the 
experience of color are quite compar- 
able. Thus one may examine the less 
obvious of the two, affect, by the re- 
sponse to the more obvious, color. 


The experience of affect is passive. 
The degree of passivity is determined 
by the degree of control exerted upon 
the affective charge by the ego. That 
individual who responds to relatively 
undifferentiated color possesses an 
ego which is less able to control and 
channel affective charge. Such an in- 
dividual lacks spontaneity of action 
and readily adopts the color of his en- 
vironment. It is a logical corollary 
that an individual responding to 
relatively undifferentiated color may 
release affective charge upon the en- 
vironment in a more or less un- 
differentiated manner. Further, his 
perception of affective charge in 
others may also be relatively undif- 
ferentiated. That is, the individual 
may sense very acutely the presence 
of affect in others without being able 
to differentiate it, to identify its na- 
ture. Thus, when someone becomes 
angry with him, he may only be 
aware of the existence of a powerful
affective state in his antagonist. He
may not know what is the nature of 
this state. However, the particular 
course of action followed by the
individual depends upon his total
personality configuration.³


THE STUDIES 


A theory, no matter what its logi- 
cal integrity, must be tested by data 
set forth in a variety of studies. 
Rickers-Ovsiankina in her paper (74) 
to which reference is made above, re- 
views a number of articles and the 
present writer does not intend to 
duplicate her bibliography to any 
great extent. The studies included 
here vary from those dealing with 
normal individuals and their develop- 
ment through those concerned with 
individuals suffering from organic 
brain damage. 

Since a number of studies to which 
reference will be made later concern 
the Rorschach test, brief mention 
will be made of the treatment of color 
by users of the test. Those desiring a 
comprehensive exposition of the 
treatment of color by Rorschach in- 
vestigators may refer to any of the 
standard texts (11, 12, 16, 54, etc.) or 
to the normative studies of Hertz (41, 
42). Suffice it to say that the scoring 
of a response to color depends upon 
the degree of structure imparted to 


3 The reader who refers to the original
manuscripts by Schachtel and Rickers- 
Ovsiankina will see that for the most part the 
statement presented here is simply a more 
succinct and perhaps clearer presentation of 
some of the ideas formulated in the two 
papers. The suggestion that a blunting of the 
perception of affective drives in others ac- 
companies the response to relatively undif- 
ferentiated color may be considered the most 
important addition. The writer feels that at 
this stage of knowledge, little may be gained 
from further additions to the theory, but that 
the theory and the manner in which it is used 
in the present paper can serve to promote 
more rigorous investigations, in that way con- 
tributing to an advancement of the theory. 
the color or the degree of integration
of color with form. An undifferenti- 
ated or unstructured color response is 
one determined solely by the color of 
the blot. This is scored as C. A re- 
sponse which is determined princi- 
pally by the color but which is struc- 
tured to some extent is scored as CF. 
A response in which the color is quite
integrated with form is scored as FC. 
The nature of the affective experience 
which is represented by each of these 
scoring categories is apparent from 
what has been said above. The three 
color factors have been assigned
numerical weights as follows: FC, 0.5;
CF, 1.0; C, 1.5. These weights were 
originally suggested by Rorschach, 
and have been used extensively. 
Rorschach’s belief was that, since C 
represented a more powerful and un- 
controlled affective drive, it should 
have the greatest weighting and FC 
the least. Weighting finds applica- 
bility particularly in determining the 
so-called stability or control ratio,
calculated by the formula 


FC − (CF + C).
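
The arithmetic of the weighted color sum and of the control ratio can be sketched as follows. The sketch takes the weights as FC = 0.5, CF = 1.0, and C = 1.5, and assumes that the ratio is computed on weighted rather than raw frequencies. The function names and the response counts in the example are hypothetical.

```python
COLOR_WEIGHTS = {"FC": 0.5, "CF": 1.0, "C": 1.5}


def weighted_color_sum(n_fc, n_cf, n_c):
    """Total weighted color score for a record."""
    return (COLOR_WEIGHTS["FC"] * n_fc
            + COLOR_WEIGHTS["CF"] * n_cf
            + COLOR_WEIGHTS["C"] * n_c)


def control_ratio(n_fc, n_cf, n_c):
    """Weighted FC - (CF + C); positive values indicate a balance toward FC."""
    return (COLOR_WEIGHTS["FC"] * n_fc
            - (COLOR_WEIGHTS["CF"] * n_cf + COLOR_WEIGHTS["C"] * n_c))


# Hypothetical record: four FC responses, one CF response, no pure C.
total = weighted_color_sum(4, 1, 0)
balance = control_ratio(4, 1, 0)
```

A record with more form-dominated than color-dominated responses thus yields a positive ratio, while a single pure C response offsets three FC responses.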


The Normal Picture 


The criteria for normalcy may be 
statistical, may be based upon psy- 
chological theory and knowledge of 
dynamics, or may be philosophical. 
It is not the purpose of this paper to 
delve deeply into the dynamics or 
total configuration of any group. 
Therefore, for the criteria of nor- 
malcy (in relation to the affective as- 
pect), the following should suffice. 

First, it seems logical that in our 
society an individual must have af- 
fective drives and affective relation- 
ships with others, and must be rela- 
tively in control of these drives and 
relationships. That is, the direction 
and manner of release of affective 
charge must be ego-controlled. He 
must also be able to interpret and
integrate the affective behavior of
others. Second, the world must not 
be so firmly fixed and structured in 
his ego that he can not be moved or 
partially influenced by the particular 
configuration of the environment. To 
a certain extent, his affective be- 
havior towards others should not 
continually require meditation and 
deliberation. Third, affective reac- 
tions directed towards the environ- 
ment by the individual cannot be 
overly gross and undifferentiated, nor 
can his perception of and reaction to 
the affective behavior of others be un- 
differentiated or gross. The type of 
color response on the Rorschach Test 
representing each of the above de- 
lineations is obvious: the first by FC; 
the second by CF; the third by C. 


The Normal Adult 


Rorschach findings. What is found 
by Rorschach examination of normal 
individuals? There are several 
sources of information, among them 
the standard texts and normative 
studies by Hertz already cited. 
Klopfer and Kelley (54) suggest that 
the normal adult should give some 
color responses, but that the sum 
weight of C and CF responses should 
not be higher than the sum weight of 
FC responses. They feel that a crude 
C response, one which is not descrip- 
tive or symbolic, is a pathological 
sign. (This view is not shared by 
Hertz.) Beck (11, 12) gives no spe- 
cific norms for any group, but pre- 
sents the psychological significance of 
the various types of color responses 
and suggests some individuals who 
would give them. The sign of the 
healthy individual is FC. Such an in- 
dividual is mature and can establish 
affective relations with other indi- 
viduals. This individual may give a 
number of, or a few CF responses, but 
the sum weight of the FC responses 
should approximately equal or exceed 
the CF+C sum weight. C is sugges-
tive of regression. 

In a major normative study by 
Hertz and Baker (42) it was found 
that an average of 3.7 color responses 
is given by 15-year-old boys and girls. 
No averages are given for CF or C re- 
sponses. However, the range for CF 
responses is 0-2; the range for C re- 
sponses is 0-1. Thus, one C response 
would not be considered pathological. 
The average sum weight of color re- 
sponses is 2.8. Balance in favor of 
FC is indicated by the weighted ratio 
FC - (CF + C), which is +0.54. The
implication is clear that Hertz and 
Baker would expect normal adults to 
have at least as balanced a ratio in 
favor of FC, or a higher one. Steinzor 
(84), in testing a presumably normal 
college group, found a balance in 
favor of FC. In a healthy individual,
a sign of integration is a balance in 
favor of FC, with a total color weight 
of 3.0. 

Beck, et al. report a normative 
study based on a fairly large number 
of adults. Concerning the color fac- 
tors, they conclude: 


Of most interest is the weighting in the 
direction of CF; and FC, in that order; 
and the comparatively small instance of 
undiluted C. The population of which 
this sample is representative may, there- 
fore, in respect to affectivity, be described 
as having made some progress towards 
maturity and towards capacity for social 
rapport. Yet they are slightly more 
labile than fully stabilized. On the other 
hand, the quantity of infantile egocen- 
tricity is relatively small... unstable, 
easily excited, but resisting undisciplined 
violence (13, p. 259). 


The reader may wonder whether 
the theory has after all been formu- 
lated incorrectly, with this sudden 
finding of a predominance of CF over 
FC. However, there is another ex- 
planation which, to the writer, seems 
rather feasible. It may be, as Beck


suggests later in the study, that the 
dynamic configuration of the indi- 
vidual and the society has changed. 
The color factors reflect this change. 
A clarification may result if one at- 
tributes to these findings the psycho- 
logical significance formulated by the 
theory stated in the present paper, 
without allusions to maturity or in- 
fantile reactions. The presence of 
FC indicates a capacity for ego-con- 
trolled affectivity and the capacity 
to integrate and interpret the affec- 
tive behavior of others. The excess of 
CF over FC indicates that the indi- 
vidual may have an immediate reac- 
tion to the environment and may be 
considerably influenced by it. The 
very small instance of C is a counter- 
sign against gross and undifferenti- 
ated affective behavior and percep- 
tion. In other words, although the 
present adult may react and respond 
far more readily to his environment 
than he once did—if the earlier diag- 
noses were correct—he still has the 
capacity for constructive affective be- 
havior. 

Studies involving painting. The ma- 
jority of studies involving easel and 
fingerpainting had children as sub- 
jects. However, Waehner (87), from 
his work with college students, de- 
veloped several indices in regard to 
emotional balance, control, compul- 
sion, constriction, etc. In construct- 
ing these indices, Waehner deliber- 
ately paralleled to a considerable ex- 
tent certain Rorschach procedures. 
Superior emotional balance is re- 
flected by a color variety of three to 
six, and a relationship of color to 
form of 5C:4F. Constriction is indi- 
cated by a small variety of or no 
color. 

Napoli (65, 66) will not be dis- 
cussed here since he is more con- 
cerned with the particular hues 
rather than color in general. Those 
interested in a broad survey of paint- 







ing are referred to a recent article by 
Precker (72). 

The Mosaic Test. This instrument 
has been gaining in interest among 
individuals with varied approaches 
to the study of personality, but as yet
few articles have been published. 
Wertham and Golden (92), while 
more concerned with the forms of the 
designs reproduced, expect normal 
individuals to produce designs har- 
monious in color and distinct in con- 
figuration. Diamond and Schmale 
(26) say that normals may have a 
very wide range in the use of color 
from primitive and crude designs in 
color to extremely artistic use of 
color. In comparing data obtained 
from the Mosaic Test with that ob- 
tained from the Rorschach test, the 
authors conclude that there is a tre- 
mendous discrepancy between the re- 
sults obtained by the two instru- 
ments. This discrepancy and some 
possible theory underlying the vary- 
ing performances obtained with the 
two instruments will be elaborated 
when psychotic modes of adjustment 
are discussed later in this paper. 
Lowenfeld (60) confirms Diamond
and Schmale’s (26) findings on nor- 
mals. 


A Genetic Approach 


Many investigators feel that a cor- 
relate of increasing chronological age 
is increasing emotional control. From 
an examination of genetic studies, 
therefore, one should be able to de- 
termine, first, how the individual per- 
forms at different ages, and second, 
what is the psychological significance 
of this performance. Jersild reviews 
a number of studies of emotional de-
velopment and concludes: “The data
now available from direct observa-
tion or experimental study do not
provide the basis for a systematic
account of normal and immature
emotional behavior at various age
levels” (45, p. 760). Therefore, what
material can be cited here is admit- 
tedly sketchy and incomplete. 

Pratt, Nelson, and Sun (in 45), as a
result of their studies of neonates and
slightly older children, stress the
point that “generalized reactions pre-
dominate over specific reactions in
early childhood and the fact that dis-
tinctive patterns are difficult to de-
tect.” Taylor (in 45) also stresses the
undifferentiated nature of emotional
reactions in a study of children aged
one to twelve days. Sherman (in 45)
arrives at a similar conclusion and ex-
presses the belief that “with the pas-
sage of time the child’s behavior be-
comes increasingly differentiated and
adaptive.” Jersild (45) points out
that the manifest emotional behavior 
may change through the changing na- 
ture of the emotional problems with 
which the child is confronted. 

Gesell (32) comments briefly on the 
problem of emotional development. 
We may compare what he says with 
what may be inferred from other 
sources. 


The three-year-old attempts to con-
form and please, “as though he were sen-
sitive to the demands of the culture.”
Suggestions are accepted more readily.
He may prefer the companionship of
other children but as yet is incapable of
verbalizing his desires. He can play with
children for a while, but may suddenly
attack them. He is at least somewhat
susceptible to social suggestion. He stud-
ies the facial expressions of individuals in
his environment and attempts to inter-
pret them. “He is capable of sympathy.”
Similar sketches are drawn for the four-
and five-year-old. The progress made by
the three-year-old child is continued, but
the pattern may change slightly. “Three
has a conforming mind. Four has a lively
mind. Three is assentive; four assertive.”
Five shows much more definiteness, con-
creteness. Gesell calls this age a plateau.


The Rorschach findings. Unfortu- 
nately, the sources of information are 
again meager. Klopfer and Marguiles
(55) made a study of children aged 
two through six years. They present 
percentages of children in each age 
group who use the various color scor- 
ing categories, and the average num- 
ber of such responses given by each 
child. The results on the color factors 
are reproduced in Table 1. The au- 
thors’ findings that at the age of six 
FC dominated the types of color re- 
sponses should be particularly noted. 

A second group of children, aged 
three through seven years, was stud- 
ied several years later by Ford (30). 
To facilitate interpretation and com- 
parison with the results obtained by 
Klopfer and Marguiles, a summary 
of her results on the color factors is 
presented in Table 1. 


TABLE 1 


PERCENTAGE OF CHILDREN OF DIFFERENT AGE LEVELS GIVING COLOR
RESPONSES AND THE AVERAGE NUMBER OF COLOR RESPONSES GIVEN


Ages FC CF C* FC CF 


GC Cel 


(After Klopfer) 


\, a. re 
46 .87 «41 
30. =—.88 «1.24 
41 1.29 1.44 
30 2.08 .78 


(After Ford) 


28 32. .i2 
44 56 15 
56 36 36 
S77 6: 3 ‘ 1. 
52 St 4 1. - 




* In terms of per cent. 
+ In terms of average number. 


A principal source of discrepancy, 
at least in reporting the percentages 
of children giving the responses, is 
derived from the fact that Ford very 


carefully distinguished between pure 
C responses and color naming (Cn), 
that is, where an individual simply 
gave the name of the color. Klopfer 
and Marguiles made no such distinc- 
tion in reporting percentages. 

An explanation of the diverging re- 
sults obtained by Ford and Klopfer 
and Marguiles is difficult to find. 
Both studies were based upon chil- 
dren who could be expected to have 
comparable socioeconomic back-
grounds. One source of variance may 
lie in the particular scoring bias of the 
different investigators. The records 
used by Klopfer and Marguiles were 
submitted by a number of different 
investigators, however, and several 
other investigators confirmed the 
findings by Ford. It may be that a 
difference in time—the year in which 
the investigations were made—is a 
contributing factor. The study by 
Klopfer and Marguiles was reported 
in 1941; the study by Ford in 1946. 
Obviously both investigations took 
much time to prepare. If it can be de- 
termined that the study by Klopfer 
and Marguiles antedated by any 
great period of time that by Ford, 
this difference, when interpreted in 
the light of the recent study by Beck 
(13), may point to a fairly rapid 
altering in the dynamic configuration 
of the individual and the society. 
This is particularly true of the dis- 
crepancy obtained in regard to the 
dominance of FC.‘ 

‘The mounting percentages of children
giving color responses associated with increas-
ing age are readily explained if one recalls
Schachtel’s (77) injunction that affects are
the conscious representations of drives. Ob-
viously, a child becomes more aware of his
drives with increasing age. Therefore, a very
young child giving many color responses is
showing signs of “precocity” rather than “in-
fantilism.”

Studies involving painting. The text
by Alschuler and Hattwick (3) un-
doubtedly represents the most com-
plete and recent major work con-
cerned with painting and children.
The authors’ findings tend to confirm 
Rorschach test results with children. 
Children of three or four are quite in- 
terested in color and tend to use it 
without great regard for form. With 
increasing age, the color tends to be- 
come more and more integrated with 
form. In studying groups of children, 
one of which uses a great deal of color 
and one of which is more concerned 
with form, the authors found that chil- 
dren concerned with form were more 
self-controlled, more concerned with 
external stimuli, and had a higher fre- 
quency of reasoned (in contrast to 
impulsive) behavior than those using 
much color. 

Epstein and Schwartz (29) found 
that the number of colors used re- 
flects the emotional development of 
the child; those using under four 
colors having poor emotional de- 
velopment, lack of drive, and per- 
haps constriction. Overcontrol or re- 
tarded development is indicated by a 
predominance of form over color. 
That interest in and use of color de- 
clines after the child reaches a certain 
age was confirmed by Blum and 
Dragowitz (15). 

Thus it is seen that although an in- 
tegration of form with color in paint- 
ing—at least to a certain extent— 
may be expected from children who 
have acquired some facility with 
emotional control, there is a certain 
time lag involved as to when such 
integration occurs. A suggestion is
that the two methods of studying
emotional development (the Rorschach
test and painting) may measure
different aspects of this development.
Since the writer feels that, to a
considerable extent, the Mosaic Test
and easel painting are functionally
comparable, the discrepancies
obtained through use of the Mosaic
Test and the Rorschach test will be
discussed later, as already indicated.


Contribution of the Genetic Approach 
and the Adult Picture 


The task is now to re-examine the 
psychological significance of the color 
factors in the light of the evidence 
presented in these two sections.
Beck (12), in elaborating the psycho- 
logical significance of the pure C re- 
sponse, says: “This is the reaction 
mode of the infant, who does what he 
pleases—screams, demands food,
kicks, voids without regard to time 
and place. Response to feelings is 
exclusive and instant” (p. 30). 
Jersild (45) holds such reactions as 
described by Beck as typical of very 
young children, neonates, and chil- 
dren up until perhaps the age of two. 
The studies reviewed by Jersild (45) 
definitely point out that emotional 
control rapidly increases with in- 
creasing age. A neonate obviously 
cannot be given a Rorschach test. If 
Ford's results (and those of investi- 
gators reporting similar findings) may 
be accepted as at least typical of a 
certain class of children, it is seen that 
three-year-old children as a group 
gave but 0.2 C responses. It then be- 
comes difficult to see upon what basis 
an interpretation such as Beck’s is 
founded. Gesell (32) paints a picture 
of the three-year-old as an individual 
who has gained considerably in emo- 
tional control, but who is still mark- 
edly dependent upon the desires and 
wishes of those in his environment. 
He is also prone upon occasions to at- 
tack quickly and violently those indi- 
viduals around him. He is interested 
in studying and trying to interpret 
the facial expressions of those around 
him. 

From this picture, something com- 
parable to Ford’s results could have 
been expected. Of the three-year-old
children, 28 per cent gave FC re- 
sponses. CF responses predominate, 
while C responses are last in both per 
cent and average number. The find- 
ing that CF predominates at this age 
level, coupled with the statements
that the three-year-old child is “as-
sentive” and is “. . . sensitive to the
demands of the culture . . .” (32, p.
36) corroborates the interpretation
of the CF response presented by the 
present writer. The statement by 
Gesell (32) that the five-year level 
constitutes a plateau coincides quite 
well with the finding by Ford that it 
is only at this age level that a pre- 
dominance of FC over CF is found. 
The significance of the C response 
also seems to mean what it was 
sketched as meaning in the introduc- 
tory theory—affective charge not 
controlled or directed by the ego. 
The significance of the Cn (color 
naming) response is still to be deter- 
mined. Anyone observing the rela- 
tions of young children and parents 
will find, during a certain phase in the 
child’s development, parents point- 
ing to different colored objects, re-
gardless of their shapes, saying, “this
is green, this is yellow,” etc. It there- 
fore seems logical that, when a young 
child is presented with a new type of 
game, he will point out certain areas 
on the card and say, “this is green, 
this is yellow,” etc. Or, if one as- 
sumes that the hypothesized relation 
between affect and color holds also in 
this situation, another interpretation 
is possible. Gesell (32) points out 
that the child studies the facial ex- 
pression of those around him and 
tries to interpret them. The interpre- 
tation may be of the nature of nosol- 
ogy: that is, the attempt may be to 
identify, without the ability to act 
upon the interpretation. Then, may 
not the child, after looking at his 


father, say to himself, “he is angry,”
or “he is sad,” without having the
ability to act constructively upon his 
identification? 

No contradiction is found between 
the psychological significance that is 
attributed to the color factors when 
used by the child and the psychologi- 
cal significance that is attributed to 
the color factors when used by the 
adult. The writer feels that if one is 
successful in defining the psychologi- 
cal significance of a variable, and this 
significance remains constant whether 
the variable is used by an adult or a 
child, it is better to use this definition 
than to attempt to define the variable 
in terms of one or the other chrono- 
logical referents. 


Nonpsychopathological Deviations


The Institutionalized Child


Turning from normal individuals 
raised and living in a normal environ- 
ment, attention can be focused upon 
those individuals—more specifically, 
children—who have spent the greater 
portion of their lives in institutions. 
As a result of a study by Goldfarb 
(34), it was found that more such 
children give the pure C response, 
and that more of them exhibit the 
unbalanced ratio of sum weight
CF+C greater than FC weight. 
Goldfarb suggests that this is an indi- 
cation of the lessening of rational 
control and a greater emotional im- 
maturity. 


Goldfarb and Klopfer continue
this analysis, concluding that:


The institutional group thus shows de-
ficiencies in rational control, in more ab-
stract forms of thinking, in drive for in- 
tellectual and social attainments, and in 
emotional maturity. In a group with 
such psychological tendencies one would, 
of course, expect problems involving rest- 
lessness, inability to concentrate, and 
poor adjustment. In addition, all of the 
above listed Rorschach trends among the
“institution” children are associated with 
an air of passivity (italics added). In 
other words, the children of this group 
give little of themselves though super- 
ficially they are adjusting to reality re- 
quirements (36, p. 93). 


The writer feels it significant that 
Goldfarb and Klopfer find an air of 
passivity in institutionalized  chil- 
dren, for were one to attribute the 
usual significance to an unbalanced 
CF+C ratio, a far from passive at- 
titude would be expected—vigor- 
ously negativistic, impulsive, willful, 
etc. What this ratio actually seems 
to suggest here is a greater inclination 
to be moved and swayed by the en- 
vironmental configuration in which 
the children find themselves. It sug- 
gests a more undifferentiated emo- 
tional approach to the environment, 
possibly a rather diffuse emotional 
reaction towards everyone with
whom the children come into contact. 


Such behavior would logically follow 
from the nature of the institutions in 


which the children find themselves 
forced to live. Their behavior is in 
virtually every respect governed by 
more or less impersonal rules and 
regulations. Their relations with 
adults are limited simply because 
there are so few adults in the insti- 
tutions that personal contact is quite 
difficult. The suggestion follows 
naturally that for one to have suffi- 
cient intellectual control of affectiv- 
ity one must have the opportunity to 
learn and develop this capacity. It 
is probable, then, that the nature of 
one’s affective life is dependent,
above native endowment, perhaps, 
upon the ability and opportunity to 
learn. 


The Delinquent 


In regard to this last postulate
(affective control and opportunity for
learning), the findings of those in-


vestigators concerned with juvenile 
delinquents may have some bearing. 
“Burt holds that marked emotional- 
ity is the most frequent and most in- 
fluential of all the psychological 
characteristics of the delinquent” 
(27, p. 129). This statement sums up 
succinctly an attitude towards the 
genesis of delinquency which pre- 
vailed for a considerable period of 
time, and perhaps prevails in certain 
quarters now, judging from the stud- 
ies still concerned with the relation 
between emotionality and delin-
quency. 

Rorschach findings. The Rorschach 
findings may prove rather startling to 
those investigators still adhering to 
the classical theory of delinquency 
quoted above. Endacott (28) in a 
study of 100 delinquent boys—aver- 
age age, 14 years—found a restriction 
of color, lower FC, and lower CF 
when his results were compared to 
those of other investigators. The 
normative study by Hertz and Baker 
(42) suggests that a sum C weight of 
one to one and one-half points higher 
could be expected from boys of this 
age. Boynton and Walsworth (18) re- 
port a study of 47 delinquent voca- 
tional school (reform school) girls of 
approximately high school age. The 
authors compared the results ob- 
tained from the Rorschach protocols 
of the delinquent girls with those ob- 
tained from girls attending a high 
school located in a favorable section 
of the same town. In regard to color, 
the delinquent girls scored lower than 
the high school girls in all respects. 
The so-called impulsivity ratio was 
more in favor of CF+C in the high 
school group than in the delinquent 
group. A number of earlier studies 
reporting excessive emotionality in 
delinquents were reviewed by
Schmidl (79) and criticized because 
of inadequate sampling and other fac- 
tors. He points out the suggestion by 
Beck that delinquents can be either
extratensive or introversive. 

These later Rorschach findings sug- 
gest that some cause other than 
marked emotionality must be postu- 
lated to explain the antisocial be- 
havior of delinquents. There is a sug- 
gestion that the inability to establish 
rapport (or lack of ego-controlled af- 
fective charges) cannot be accepted 
as a general factor in explaining de- 
linquency. Boynton and Walsworth 
(18) feel that one should be quite 
careful in using personality aberra- 
tions as explanations for delinquent 
behavior. Endacott sums up his find-
ings by saying, “these are marks of a
rigid, stiff-geared sort of personality 
that has been created to withstand 
strong pressures and frustrations” 
(28). 

The implication for affect or color 
theory is clear. If it can be shown
that a group having a somewhat 
lower use of color and a more stable 
color ratio than “normal” individuals
indulges in strong, “self-centered,”
antisocial behavior, it is a logical 
deduction that it is not the affective 
relationship with the environment 
which should be postulated as a 
causative factor. There is a further 
indication that the presence of ego-
controlled affectivity or “capacity
for rapport” suggests nothing more
concerning the individual than that
he can direct and control his af-
fective charges and can interpret
and integrate the affective behavior
of others. “Capacity for rapport”
suggests nothing concerning the con-
tent of the behavior of the indi-
vidual. The delinquent, it would
appear, is acting in accordance with
his picture of the reality, i.e., in ac-
cordance with his ego. A logical
corollary of the above deduction is 
this: where antisocial, self-centered 
behavior appears concomitantly with 
extratensive, unbalanced use of color, 
one should look beyond affect for an
explanation of this behavior. 

A study using fingerpainting. It has 
often happened that where one 
method failed to provide insight into 
a particular problem, another method 
has succeeded at least partially. The 
method of approach to and the prod- 
ucts of fingerpaintings of delinquent 
and high school youths were com- 
pared by Phillips and Stromberg 
(69). While a number of significant 
differences are reported in their 
study, only one need be discussed 
here. Thirty-six per cent of the high 
school group used only one color on 
the first performance. Sixty-four per 
cent of the delinquents used only one 
color on the first performance. This 
difference is not quite significant at 
the .05 level. However, on the second 
performance, 4 per cent of the high 
school students used only one color, 
while 60 per cent of the delinquents 
continued to use only one color. The 
result of this comparison is highly 
significant statistically. 
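The two comparisons can be illustrated with a chi-square test for a 2 x 2 table using Yates's continuity correction. The sketch below is illustrative only: the source gives neither the group sizes nor the test actually used, so the 25 subjects per group (the smallest size at which all four reported percentages correspond to whole counts) and the choice of test are assumptions, as is the function name `yates_chi_square`.

```python
def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square with Yates's correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shrink the cross-product difference by n/2, never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * corrected ** 2 / denominator


# Critical value of chi-square with 1 degree of freedom at the .05 level.
CRITICAL_05 = 3.841

# Hypothetical counts, assuming 25 subjects per group.
# Rows: high school, delinquent. Columns: one color only, more than one color.
first_performance = yates_chi_square(9, 16, 16, 9)     # 36% vs. 64%
second_performance = yates_chi_square(1, 24, 15, 10)   # 4% vs. 60%

if __name__ == "__main__":
    for label, value in (("first", first_performance),
                         ("second", second_performance)):
        verdict = "exceeds" if value > CRITICAL_05 else "falls short of"
        print(f"{label} performance: chi-square = {value:.2f}, "
              f"{verdict} {CRITICAL_05}")
```

With these assumed counts the corrected chi-square is about 2.88 for the first performance, short of 3.841, and about 15.53 for the second, well beyond it. This agrees in direction with the reported results but should not be read as a reconstruction of the authors' analysis.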

If it can be assumed that one’s 
handling of color is indicative of his 
affective life, the nondelinquent, and 
the delinquent to a greater degree, 
might here be showing a certain 
amount of shock when confronted
with a new (and perhaps affective)
situation, and thus respond in a some-
what stereotyped manner. However, 
the nondelinquent shows a consider- 
able degree of recoverability and a 
capacity for a wide variety of re- 
sponse. The delinquent, on the other 
hand, continues to show a stereo- 
typed reaction. When one recalls a 
deduction made from the evidence 
reviewed concerning institutionalized 
children, it is possible that the en- 
vironment in which delinquents live 
does not make it possible for them to 
learn a variety of emotional reactions. 
Thus, while it may not be any funda- 
mental lack of capacity for emotional 
rapport or ego control of affect which
contributes to their antisocial be- 
havior, it may be an inability of the 
delinquents to vary their emotional 
response. This suggestion accords 


with the reduced use of color in gen- 
eral on the Rorschach test and the 
reduced use of CF by delinquents. 


Psychopathological Deviations 


Investigators have found that con- 
siderable insight may be obtained 
into certain dynamic relationships 
and functions by studying individu- 
als exhibiting more or less psycho- 
pathological reactions. It is reasona- 
ble to expect that a comparable result 
may be obtained here by reviewing 
the behavior of such individuals to- 
wards color. 


Alcoholics 


Using the Rorschach test, Billig 
and Sullivan (14) found that the af- 
fective picture presented in the use of 
color by alcoholics is of considerable 
prognostic value. In reference to 
those alcoholics who over a period of 
time showed the least favorable prog- 
nosis, the authors conclude: “In 80% 
of the cases factors indicating impul- 
sive emotional behavior are stronger 
than those expressing smooth adjust- 
ment to environmental influences” 
(14, p. 124). However, if their table 
is accurate, the impulsive use of color 
is reflected primarily by the use of 
CF rather than C. There is a marked 
reduction in the appearance of FC, 
and consequently the color ratio is 
unbalanced in favor of CF. The au- 
thors feel that their results confirm a 
previous study by Bowman and Jelli-
nek, who made the statement that
“the chronic alcoholic shows a com- 
paratively weak restraint, poor men- 
tal poise and stability, difficulties in 
controlling his mood swings and de- 
sires, combined with a lack of atten- 
tion” (14, p. 124). 




It is immediately apparent that the 
statement cited from the paper by 
Bowman and Jellinek presents an in- 
terpretation which is far more com- 
parable to that which would be made 
by the present writer (drawing upon 
the theoretical outline sketched) on 
the basis of the evidence presented by 
Billig and Sullivan than the one ac- 
tually made by the latter authors. 
The comparative lack of FC re-
sponses among the alcoholics with
a poor prognosis, coupled with the
overabundance of CF responses,
suggests that these alcoholics are
particularly affected by the envi- 
ronmental configuration in which 
they are immersed and are lacking 
the capacity to integrate and control 
the affective drives which are there- 
fore readily aroused in them. The ap- 
pearance of pure C in these cases 
would simply add a more unfavora- 
ble touch by suggesting relatively 
undifferentiated and diffuse emo- 
tional reactions and interpretations. 


Enuretics 


Enuretic children under ten years 
of age were shown by Goldfarb to 
have a high sum C total “with a con-
spicuous excess of CF and uncon-
trolled C responses. Emotional de-
velopment at a primitive, infantile,
impulsive level is suggested” (33, p.
30). Specific figures are not given. 
However, a glance at the studies by 
Ford (30) and Swift (86) shows that,
when color naming is excluded, pure 
C is not a particularly frequent type 
of response. To be sure, CF out- 
weighs FC, but some FC does ap- 
pear. A high frequency of C and CF 
responses does not seem to be typical 
of children of the ages studied so far, 
and considerable difficulty would be 
encountered in the task of determin- 
ing whether such a reaction is typical 
of infants in the technical sense of the 
term, i.e., from birth to two years.







For one to call this excessive use of 
unbalanced color infantile is to use 
an analogy which is dubious and 
which may never be confirmed. Even 
were the analogy correct, to say that 
something is infantile is not in itself 
particularly expressive because little 
concerning the dynamics of the func- 
tion involved is suggested. What this 
mode of usage of color by enuretics 
suggests (as has been shown in other 
cases) is a considerable degree of in- 
fluence by the environment, with a 
relatively diffuse reaction towards it, 
and a reduction in the ego control of 
affect. Study of the ego of the enu- 
retic might be more revealing of the 
dynamics of this particular dysfunc- 
tion, i.e., enuresis, than analysis of 
the affective factors alone. Then, 
one may infer that this undifferenti- 
ated type of reaction and excessive 
influence of the environment con- 
tribute to the development of the 
symptom. Goldfarb’s study does 
contribute to our knowledge of the 
ego content of the enuretic. Only the 
interpretation of the color factors is 
in question. 


The Hysteric 


Schafer (78) suggests that a char- 
acteristic of the hysteric is a pre- 
dominance of CF+C over FC in the 
Rorschach test. A further character- 
ization is a “minimization of active 
and independent ideation as a means 
of coping with problems” (78, p.
33). Such characterizations tend to 
substantiate the theory that a less 
integrated color response suggests a 
greater susceptibility to environmen- 
tal influence and a lessening control 
of one’s affect. 

The problem of egocentricity en- 
ters the picture. The present writer 
feels, as he earlier proposed, that the 
presence of relatively uncontrolled 
color is neither an indication of nor a 
countersign against egocentricity as 


it appears in hysterics and others. 
He feels that it would be much more 
feasible to regard the behavior of the 
hysteric which leads others to call 
him egocentric as a quality of the ego 
content, or picture of reality, of the 
hysteric rather than of his affective 
life. 


Schizophrenic Adjustment and the 
Rorschach Test 


Among the psychotic modes of ad- 
justment, that of schizophrenia has 
attracted the most attention. No 
detailed review of the dynamic pic- 
ture of the schizophrenic will be at- 
tempted since the task would be com- 
plicated by the belief held by many 
investigators that schizophrenia is 
not a single disease entity. Beck (9, 
11, 12), Kelley and Klopfer (49), 
Klopfer and Kelley (54), Rickers- 
Ovsiankina (73), Kisker (52), Stern 
and Malloy (85), Kendig (51), and 
others, feel that a characteristic of 
schizophrenia is an overwhelming 
imbalance in color responses in the 
direction of CF+C. The total color 
weight may be large or small. As 
Beck (9) suggests, affect is not absent 
or even negligible in the schizo- 
phrenic as was once felt to be the 
case. 

The color factors indicate first, ac- 
cording to the present writer’s intro- 
ductory scheme, that the schizo- 
phrenic is considerably influenced by 
the environmental configuration in 
which he finds himself; second, that 
his perception of affective life in 
others and his affective reaction to 
the environment are undifferentiated 
and gross. The inappropriate af- 
fective reaction frequently noted in 
the schizophrenic is explained by the 
finding of pure C in the record. If 
the schizophrenic cannot interpret 
correctly the nature of the affective 
situation with which he is confronted, 
this, coupled with the fact that he has
little ego control of his affectivity,
certainly would make it rather a coin- 
cidence if appropriate emotional re- 
actions did result. 

To postulate that the schizophrenic 
is considerably influenced by the en- 
vironmental configuration may de- 
viate somewhat from the typical con- 
cept of the schizophrenic as sepa- 
rated from reality. The mere fact 
that the schizophrenic is consider- 
ably influenced by the environment 
does not indicate that his reactions to 
the environment will be realistic. 
The content of his behavior is de- 
termined by the content of his ego. 
It is agreed that the ego of the schizo- 
phrenic contains far from a realistic 
conception of the environment. 
Therefore, this environmental influ- 
ence operating upon a distorted pic- 
ture of reality would only add to the 
confusion and bizarre reactions of the 
schizophrenic. The susceptibility to 
environmental influence might ex- 
plain the finding by several clinicians 
that a great deal of what occurs 
around the schizophrenic in a cata- 
tonic stupor is frequently remem- 
bered by the schizophrenic when he 
recovers from the stupor. 


Schizophrenic Adjustment and the
Mosaic Test: The Suggestion of a
Theory


Diamond and Schmale’s (26) in- 
vestigation of schizophrenia by use of 
the Mosaic Test, concerned with the 
dynamics of the schizophrenic, theory 
of the Mosaic Test, and affect-color 
theory, may prove to be of consider- 
able value. The authors found that 
the schizophrenic completely disre- 
garded the color of the pieces in the 
construction of his design. 


The color defects of the schizophrenic 
deserve special discussion. Color rejec- 
tion or color disregard appears very early 
in this disease even though the personality
and the Mosaic pattern are seemingly
well-integrated. ... The Mosaic pattern
is exactly as if it were constructed by a 
totally color blind individual. It might 
be called a psychological color blindness. 

It was very difficult to compare the 
color responses to the Rorschach and the 
Mosaic Tests in individual cases, and lit- 
tle consistency between the two were 
shown (p. 246). 

An obvious source of this incon- 
sistency with the two tests is in their 
basic dynamics. The Rorschach ink- 
blots are often called unstructured, 
ambiguous, undefined. This is true, 
but only in a certain sense. They do 
exist; they do have very definite and 
unchanging configurations. No mat- 
ter how the subject looks at the blots, 
there is no change in the actual shape 
of the blots. The Mosaic Test is com- 
pletely different. Before the subject 
are a large number of little blocks of 
many different shapes and colors. 
He moves them around and can put 
them back together in a multitude of 
different ways. (It can be seen that 
the functioning of an individual in 
regard to easel- or fingerpainting is 
comparable to what it is with the 
Mosaic Test. The writer feels that 
the task presented by the Mosaic 
Test is more difficult than that pre- 
sented by finger- or other painting.) 

Many investigators feel that the 
task demanded of the individual con- 
fronted with the Rorschach test is 
essentially a creative one. The same 
can be said of the Mosaic Test. But 
is not the nature of the creative 
activity, the basic mechanism of the 
creative activity, extremely different 
from one test to the other? The 
creativity involved in the Rorschach 
test is exclusively an associational 
one. Upon being presented with an 
unchanging configuration, the indi- 
vidual is asked to call upon the con- 
tent of his ego for a concept which 
corresponds more or less to the actual 
shape of the configuration. The indi- 
vidual is not asked to alter reality.
He changes nothing in the external
environment nor does he create any- 
thing in it. The task involved in the 
Mosaic Test is also associative, but 
it is far more than that. Before one 
constructs a definite thing in the external
environment, he has a more or
less defined image of that thing in his 
mind. The nature and variety of 
what is brought forth depends, as it 
does in the Rorschach test, upon the 
content of the ego and its associative 
facility. But the development of the 
concept or image must be followed 
by an alteration in the external en- 
vironment. The resulting product 
depends upon manipulative skill to 
some extent, but to a greater extent 
upon the individual’s capacity to 
translate his more or less well-de- 
fined image into concrete reality. 
How does this difference in the
basic mechanisms involved in the
two tests affect the product? The
response to color only is of pertinence
here. Consider the schizophrenic individual
whose capacity for ego-controlled
and ego-oriented affective
charges is considerably reduced, and 
whose emotional responses are be- 
coming increasingly gross and undif- 
ferentiated. On a piece of cardboard 
before him, he sees a blob of color. 
Not much capacity or skill is required 
for this individual to call forth a 
vague, structureless association. 
More often than not, if he succeeds 
in developing a structured response, 
the response will not fit the configur- 
ation confronting him. 

In the Mosaic Test, for the indi- 
vidual to handle color properly— 
that is, if he is not to ignore it—he 
must first be able to visualize or con- 
ceive an image or configuration in- 
volving color and then reproduce 
this image by an alteration of the 
external environment. An individual 
whose capacity for ego-controlled af- 
fectivity is greatly reduced would
have much difficulty in conceiving an
affective situation and more in ma- 
nipulating it in the external environ- 
ment. The product of this individual 
would, of course, show not only a 
disregard of the color of the Mosaic 
blocks, but very poor form as well. 

But an individual who is not quite 
sure of himself, who is becoming 
aware that something about his 
handling of affectively charged situa- 
tions is not quite right, would, even 
if he could conceive a fairly adequate 
affectively charged image, hesitate 
to bring this image forth, to repro- 
duce it in the external reality where 
it would be visible not only to him- 
self but to others. Since Diamond 
and Schmale (26) found that even in 
the very early stages of schizophrenia 
color was ignored while the capacity 
to integrate form was relatively in- 
tact, the suggestion is strong that one 
of the very first signs of the schizo- 
phrenic process is an uncertainty, 
perhaps even consciously realized, 
that one is losing his capacity to 
handle affective situations. 

The writer thinks it will be agreed 
that capacity to handle emotionally 
toned situations will vary among in- 
dividuals who do possess facility to 
conceive and interpret such situa- 
tions. If this is so, and if the writer's 
analysis of the process underlying 
the Mosaic Test is correct, it is not 
difficult to see why results obtained 
in one test are not comparable to 
those obtained in the other. Thus 
there are two aspects to affective
situations: the ability to conceive and 
interpret them; and the ability to 
handle them in the external environ- 
ment. The reasoning followed here 
tends to be confirmed by a study of 
institutionalized children conducted 
by Colm (24). These children also 
used color indiscriminately. It is ob- 
vious that institutionalized children 
have limited scope and opportunity
for learning to handle affective situations.


The Feebleminded Individual 


Davidson and Klopfer (25), Kel- 
ley (48), Abel (1), and Werner (90) 
agree that on the Rorschach test the 
mental defective, although he may 
use less color, uses it in an unbalanced
fashion, i.e., CF+C greater
than FC. Abel (1), in studying de- 
fectives showing the least inclination 
to succeed in school, found them to 
give more such unbalanced records 
than successful defectives. He con- 
cluded that the former are more susceptible
to “stimulation from the external
environment without adequate
control of the situation.”


Studies of Individuals Having Cere- 
bral Disorders 


Epileptics. Guirdham (38), Arluck 
(6), and Kelley (50) agree that the
epileptic gives fewer color responses
on the Rorschach test than the normal
individual. However, these re-
sponses are definitely unbalanced in 
favor of CF+C. The suggestion is 
that the epileptic has little emotional 
communication with the environ- 
ment, and that such communication 
as does occur is not under ego con- 
trol, tending to be very gross and 
undifferentiated. It is of consider- 
able significance that Drohocki (cited 
in 70) found, upon repeated exam- 
ination of epileptics beginning imme- 
diately after seizure, an extremely 
dilated picture, with evidently a 
number of color responses, the ma- 
jority of which were unbalanced in 
favor of CF+C. Stainbrook (82)
found somewhat similar occurrences,
however, with first the appearance
of Cn, then CF and FC.

The findings of Drohocki and of 
Stainbrook were emphasized. Could 
they not suggest something such as 
the following? It is reasonable to
suppose that, immediately following
a convulsion, the synaptic connec- 
tions within the brain are weakened, 
distended. A virtually physical 
increase in ego permeability would 
thus occur. At the least, one must 
admit that individuals having just 
experienced a severe convulsion would 
be dazed and would have less control 
of affective drives; would of necessity 
respond to the environment and per- 
ceive it in a rather gross, undifferen- 
tiated fashion. 

Brain-injured individuals. Where 
actual destruction of brain tissue is 
known to have occurred, most in- 
vestigators (53, 64, 70, 78, 91) find 
that on the Rorschach test “organics”
present a picture of extratensiveness,
with CF+C responses predominating
over FC. There is an
additional indication that may be 
present where the other is not—color 
naming. The functions that color 
naming may play in the dynamics of 
the child were previously discussed: 
color naming may be a rather con- 
crete response evoked by the practice 
of parents teaching their children 
the names of various colors by point- 
ing to an object and simply giving 
the name of the surface color; or it 
might be due to an attempt on the 
part of the child to designate an emo- 
tional situation without being able 
to act on his interpretation. It is 
doubtful which, if either, of these 
interpretations is applicable to the 
brain-injured individual. If it is the 
former, this is an indication of the 
adoption of an ineffective concrete 
attitude. If it is the latter, a be- 
wilderment, an inability to integrate 
and resolve affective situations, may 
be indicated. 

A further contribution. Bychowski 
(22) has published a paper resulting 
from his observations and study of 
individuals undergoing treatment 
and training in a rehabilitation center
for the brain injured. He found that,
while certain physiological changes 
resulting from the injury might not 
be alleviated, a considerable im- 
provement in the psychological be- 
havior of the individual frequently 
resulted from the retraining given by 
the clinic. It is as if the adaptive and 
emotional skills of the individual had 
been destroyed or eliminated at least 
temporarily by the injury. It was 
necessary for the individual to re- 
learn, under very careful guidance 
and supervision, adaptive and emo- 
tional skills. The writer feels that 
Bychowski’s report tends to confirm 
the postulates made in other sections 
of this paper about the relation be- 
tween the affective picture presented 
by the individual and his capacity 
and opportunity to learn. 


A Free Behavior Situation 


Young and Higginbotham (96) at- 
tempted to correlate certain Ror- 
schach factors with actual behavior 
in a free situation (summer camp). 
While a number of records corre- 
sponded approximately to the be- 
havior in which the boys indulged, 
there were some striking exceptions 
—such as the child who was seem- 
ingly the most excitable and impul- 
sive in the entire camp. This child 
had given no color responses on the 
Rorschach test, and the authors were 
inclined to question whether color 
was actually an indication of im- 
pulsivity; if it were, one might cer- 
tainly expect this child to use a 
great deal of color, with most of it 
being of the CF or C variety. The 
authors’ point was well taken. The 
present writer has expressed the 
opinion that impulsivity is not nec- 
essarily a concomitant of unbal- 
anced color. Schachtel (77) suggested 
that affects (and color) are conscious 
manifestations of instinctual and 
other drives. An obvious conclusion
in the case of the particular individ-
ual in question is that he has re- 
pressed his affective drives and is not 
even attempting to handle them on a
conscious basis or level, or that he has 
never learned to recognize affective 
behavior in himself or others. From 
such an individual one would expect 
quite flighty and ‘impulsive’ ac- 
tions. 


The Strongest Challenge 


Siipola, in a highly original experi- 
ment concerned with the effect of 
color upon the responses of individ- 
uals taking the Rorschach test, concluded:
“Apparently, the mere presence
of color in a blot does not endow
it with magic affect-arousing
properties” (81, p. 381). She did find,
however, an increase in the number of 
emotional attitudes when the individ- 
ual was confronted by the colored 
blots and, as a result of the color, “a
weak selective influence among form 
dominated concepts, and a strong 
disruptive influence involving symp- 
toms suggestive of conceptual con- 
flict and behavioral disorganization” 
(81, p. 381). 

Many individuals when confronted 
with an achromatic version of the 
blot gave exactly the same response 
given by those individuals confronted 
with the chromatic version. One 
example of such behavior was the response
of “butterfly” to the middle
red area of Card III. The present
writer has found time and time again 
that some individuals when con- 
fronted with this blot in the chro- 
matic version will give the response 
of butterfly, and then when asked 
what contributed to the forming of 
the concept answer that it was only 
the form, just the shape. The color 
was actually ignored. There are 
many other individuals who do re- 
spond to the color of the blot and use 
the color in developing their concepts.
It is the response to color
which is important, and not just the 
fact that the color is there to be re- 
sponded to by any except the color- 
blind individual. 

Siipola (81) also found that those 
blots which especially aroused emo- 
tional responses or emotionally 
charged reactions were those in which 
the color was particularly incongru- 
ous to the usual forms that were con- 
ceived in the achromatic version. 
She felt that it was this incongruity 
of color and form which aroused the 
emotional reaction rather than sim- 
ply the presence of color, for where 
there were no incongruities the emo- 
tional reactions did not occur. This
finding ties in closely with her certainly
well-grounded statement that
as yet no one has succeeded in bridging
the gap between color and affect
by other than empirical data or theory
based upon empirical data.

This last finding, that of emotional 
reactions being the product of color 
and form incongruity, may force a 
reconceptualization of the problem. 
Let it be granted, for the moment, 
that color in and of itself has no ef- 
fect. Its mere presence is of no im- 
portance. However, some individuals 
will respond to the colored inkblot 
by incorporating the color into a 
concept. There are other individuals 
who will look at this same colored 
inkblot without incorporating the 
color into a concept. The point that 
affect-color theorists and empiricists 
wish to make is that the individual 
who does respond to color is in some 
way different from the individual who 
does not do so. Further, these theo- 
rists and empiricists feel that the 
manner in which the color is used is 
a very keen tool for analyzing the in- 
dividual’s affective life, and for gain- 
ing some insight into his ego func- 
tioning. At the present time, more 
information exists upon the latter 
phase (as this review shows) than 
the former. A suggestion is that 
those individuals who do respond to 
color show more permeability, more 
susceptibility to influence by the en- 
vironmental configuration than those 
individuals responding little or not 
at all to color. 

Can it be granted that the mere 
presence of color has no effect upon 
the organism? Goldstein’s well-known 
experiment with brain-injured indi- 
viduals (37) seems to indicate clearly 
that exposing color to such individ- 
uals may markedly influence their 
physiological functioning. A study 
by Baccino (7) rather definitely indi- 
cates that chromatic illumination has 
a rather profound influence on the 
physiological functioning and growth 
of certain animal organisms. Con- 
versely, the suggestion is strong that 
this strong response to color is ac- 
tually due to definite brain damage 
resulting in decreasing effectiveness 
of inhibitory centers (91). 

An experiment by Kravkov is 
highly provocative. This investigator 
injected into humans certain drugs 
which were known to have an effect 
upon the autonomic nervous system. 
He then compared the sensitivity of 
the eye to various colors. 

“Changes in color sensation indi- 
cate a definite regularity depending 
upon the portion of the vegetative 
nervous system which is chiefly stim- 
ulated. Thus (use of) sympathetic 
toxins... bring about an increase 
in color sensitivity with respect to the 
green-blue rays of the spectrum and 
in contrast lower the color sensitiv- 
ity with respect to the orange-red 
rays. The utilization of parasym- 
pathetic toxins brings about an in- 
crease in the sensitivity to orange- 
red rays and lowers the sensitivity to 
green-blue rays” (56, p. 94; translated
from the Russian).

While these particular findings are
not of tremendous importance here, 
the writer feels that the fact that 
alterations in the autonomic nervous 
system can cause a change in the be- 
havior of individuals towards color 
is of great importance. Recently,
there has been accumulating a con-
siderable body of evidence to suggest 
that the condition of the autonomic 
nervous system may have a profound
effect upon behavior (31, 89). 


A RETURN TO THE THEORY 


There has been sketched in this 
paper a theory of the relation of the 
response to color and personality
dynamics. Two papers, one by 
Schachtel (77), the other by Rickers- 
Ovsiankina (74), contributed heavily 
to its formulation. The theory with 
some of its ramifications is briefly 
stated below. 

The response to color is indicative 
of a certain functioning of the ego: 
the relation of the ego to its external 
environment, its degree of communi- 
cation with, and readiness to respond 
to it. One may determine much con- 
cerning the affective life of an indi- 
vidual through an analysis of his 
response to color: his control of af- 
fective charges, his capacity to inter- 
pret and integrate the affective be- 
havior of others. But Rickers- 
Ovsiankina (74) speaks of the per- 
meability of the ego. Does this per- 
meability refer solely to affect or 
could it also be extended to cover 
intellective functions as well? Is 
there actually a rigid demarcation 
between an individual’s affective and
intellective life areas? Can an individ- 
ual be shut off affectively from his 
environment and still participate in- 
tellectively with it? 

One of the personality patterns 
presented by Beck (12) may shed 
illumination. Beck draws a picture 
of a university president, a skilled
scientist who has made valuable con-
tributions in many areas. This in- 
dividual gave nine FC and five CF re- 
sponses. Interpreted according to 
the theory presented here, this indi- 
vidual is exceptionally responsive to 
the environment in which he lives, 
but he also has tremendous capacity 
and power to integrate and control 
this responsiveness. 

Whence comes the material with 
which we create but the environ- 
ment? If the degree of communica- 
tion with the environment is limited, 
then the material with which to
create is limited. An individual who 
has limited communication with his 
environment may be able to do much 
with what he has. But does not the 
individual with considerable environ- 
mental communication, granted the 
capacity to control and integrate his 
responsiveness, have a tremendous 
advantage over his fellowman who 
has not this responsiveness? 

A second ramification concerns the 
clinician or other investigator seeking 
to learn of the affective life of an in- 
dividual and his environmental re- 
sponsiveness. A point made by 
Schachtel (77) must be continually 
kept in mind. Affect, and color, is 
the conscious manifestation of in- 
stinctual and other drives. If the in- 
dividual has no or little experience of 
affect or of color, the possible infer- 
ences are three: (a) he may have an 
inherently limited capacity for affective
experience, (b) he may be repressing
affective experience, or (c)
he may never have learned to express 
his affective drives. If the second 
condition exists, then the individual 
is not dealing with his affective life 
on a conscious level, and one should 
not expect to find representations of 
affective life, such as the use of color, 
in tests such as the Rorschach and in 
similar situations. This condition 
may have been operative in the case
of the ‘impulsive’ individual described
by Young and Higginbotham
(96). If the third condition exists, 
one should anticipate evidence of 
limited and perhaps stereotyped af- 
fective experience. This third condi- 
tion the writer feels to be particu- 
larly exemplified by the juvenile de- 
linquent (69), although the second 
condition could also be operative. 
The selection by the clinician of the 
particular condition which is in effect 
must rest upon an analysis of other 
factors and a careful case history. 

A caution must be made. For the 
affect-color theory to be functionally 
correct, there need not be any emo- 
tional or affective reaction to color. 
It is the response to color which is 
important. There need be no “magic
affect-arousing properties” of color.
There may be affect-arousing proper- 
ties and emotional or affective reac- 
tions to color may occur, but these 
factors are not ingredients of this 
theory. 


SUMMARY 


1. A theory concerning the nature 
of the relation of the response to color 
and personality dynamics was pre- 
sented. The theory strongly sug- 
gests that much can be learned from 
the response to color by the individ- 
ual concerning the nature of the 
relation of the ego to the external 
environment as well as the relation 
of the ego to the affective drives of 
the individual. 

2. A number of studies, drawn 
principally from the large body of 
Rorschach data, but also including 
several based upon the Mosaic Test 
and easel painting, were reviewed. 
The writer feels, first, that the theory 
is substantiated by the articles reviewed,
and second, that a clarifica-
tion of the dynamics of certain nor- 
mal as well as disease processes re- 
sults when the theory presented here 
is used in interpretation rather than 
prevalent practices.


BAUER, RAYMOND A. The new man in
Soviet psychology. Foreword by 
Jerome S. Bruner. Cambridge, 
Mass.: Harvard Univer. Press, 
1952. Pp. xxiii+229. $4.00. 


In view of the very real curiosity— 
to use a mild term—which most of us 
feel regarding events on the other
side of the semipermeable (?) mem- 
brane which separates us from the 
Russians, this little volume should
be warmly welcomed by American 
psychologists. Since so few of us have 
direct access, for various reasons, to
the Soviet psychological literature, 
and since contacts with our Russian 
colleagues have been reduced to the 
vanishing point, we are all indebted 
to Bauer for bringing us relatively 
up to date on recent developments. 

From the point of view of scientific 
psychology the picture which Bauer 
presents is generally discouraging. 
No journal, specifically psychological 
in nature, has appeared since 1934; 
articles of psychological interest are 
published mainly in a journal devoted 
to pedagogy. There has been a strong 
reaction, still evident, against psy- 
chology as an independent discipline. 
The study of attitudes has been con- 
demned and virtually abandoned. 
No public opinion surveys may be 
conducted, and social psychology in 
general “has become virtually a proscribed
area” (p. 169). A Party decree
in 1936 resulted in the almost
complete suppression of the use of 
psychological tests, which were “formally
characterized as instruments for
perpetuating the class structure of
bourgeois societies” (p. 124). Scientific
theory, in psychology as elsewhere,
is validated not in terms of its
relation to empirically verified facts, 
but by the contribution it can make
to the Party’s program. A psychologist
is quoted as stating that “every
theoretical mistake, every error in
the field of methodology is inescapably
transferred into a political
error” (p. 106). It goes without saying
that “incorrect” views cannot be
expressed or tolerated. 

Bauer characterizes his book as 
“partially a history of the science of 
psychology in the Soviet Union, 
partially a study of the pattern of 
social change in that country, largely 
an analysis of changing conceptions 
of human nature under conditions of 
social change, to a certain extent an 
inquiry in the relation of ideology to 
action, somewhat a study of the re- 
lationship of psychology to society” 
(p. ix). On the whole this ambitious 
program is effectively realized. There 
is, of course, a close relation between 
the character and structure of a 
society and the current beliefs con- 
cerning the nature of man, his goals 
and motives, his development and 
socialization. Bauer has presented 
some striking correspondences be- 
tween the character of the society as 
a whole, and the specific develop- 
ments that have occurred in the 
field of psychology. 

Nowhere does this come out more 
clearly than in his discussion of the 
changes which have occurred in the 
concept of the nature of man, as a 
reflection of the changes which took 
place in the political scene. In the 
1920's, for example, Soviet psychol- 
ogists proceeded on the assumption 
that man’s nature was essentially 
passive, his characteristics deter- 
mined by the (mainly economic) en- 
vironment. Fundamentally good and 
noble, man had been misled and per- 
verted by the evil (that is, bourgeois,
capitalist) system under which he
lived. Even after the Bolshevik revo- 
lution, the environment could still be 
held responsible, because it contained 
many of the elements of the older 
social and economic structure. By 
1936, however, socialism had alleg- 
edly been fully realized in the Soviet 
Union. From that time on, it became
impossible to blame the (socialist) 
environment; the responsibility was 
now placed upon the individual him- 
self. “The dominant conception of
man became that of an increasingly
purposeful being, who was more and
more the master of his own fate, and
less and less the creature of his environment”
(p. 7).

That meant a movement away 
from behaviorism, reflexology, and 
“reactology” to a more “purposive”
variety of psychology. Consciousness 
was restored to a dominant role in 
human affairs, and the unconscious 
fell correspondingly into disfavor; not 
that its existence was denied, but 
rather that it became subordinate in 
importance to conscious, purposive 
action. The source of error was now 
found at least in part in man himself; 
he needed the right training, and the 
right self-training, to set him upon 
the proper road. From the point of 
view of Western psychology, train- 
ing is of course a part of what we 
would call the environment. Bauer 
suggests that this distinction is drawn 
by Soviet psychologists “mainly to
deprecate the importance of such
aspects of the environment as the
actual material conditions under
which the child lives” (p. 148). The
question arises as to whether the 
words for “environment” have a 
somewhat different connotation in 
Russian and English respectively, in 
which case the Soviet attack on en- 
vironmental explanations of behavior 
would really represent an attack on a 
very restricted variety of environmentalism.
In any case, the coinci-
dence between the political pro- 
nouncements of 1936 and the attack 
on the prevailing science of psychology
is a striking one.

Bauer draws an interesting con- 
trast between the Soviet view of man 
and that prevalent in Nazi Germany. 
The Nazi view held that man was 
moved primarily by the unconscious, 
the nonrational; the Bolshevik stresses 
consciousness and rationality. The 
Nazi stressed man’s weakness, his 
helplessness, his need of a leader; the 
Bolsheviks insist on man’s responsi- 
bility for his behavior, on his ability 
to make his own destiny—though the 
only “right” destiny is that which follows
the party line. “For the Nazi,
man was a marionette who moved 
when one pulled the strings. For the 
Bolshevik, he is a robot who can be 
trained to act independently within 
specified limits” (p. 178). One may 
argue about some of the details of 
these characterizations, but it seems 
clear that the two dictatorships do 
differ markedly in the meanings 
they attach to human nature. Per- 
haps we have here a clue to a differen- 
tial diagnosis of varieties of dictator- 
ship in psychological terms. 

Bauer’s informative study would 
have been still more valuable, at least 
in the opinion of this reviewer, if he 
had included somewhat more dis- 
cussion of some of the specific in- 
vestigations carried out by Russian 
psychologists. Granting that his 
interest was in the development of 
theory, his thesis could have been 
more clearly illuminated by a fuller 
demonstration of the manner and ex- 
tent to which theory dominated the 
collection of “facts.” One might 
argue also about the amount of em- 
phasis which Bauer places on the dis- 
continuities to be found in Soviet 
psychology. It is interesting that Dr. 
Joseph Wortis in his Soviet Psychia- 
try (Baltimore, 1950), as Bauer him- 
self indicates, was much more forcibly 
struck by the continuities. This dif- 
ference in interpretation by two 
scholars examining closely related 
material calls for fuller exploration 
than that contained in a brief foot- 
note. These are relatively minor 
issues, however, compared to the 
over-all value of Bauer’s study. The 
Russian Research Center at Harvard 
has made a significant addition to its 
excellent series. 

The last sentence of the book is 
worth repeating. “Political inter- 
ference in science does not destroy 
completely the usefulness of science 
to the system, but the continued sup- 
pression of freedom of scientific in- 
quiry must ultimately lead to the 
point where the society cannot solve 
its own problems effectively” (p. 
196). This applies in the United 
States just as it does elsewhere. It 
cannot be said too often: there can be 
no real development of science where 
there is no freedom—freedom to ex- 
plore, to doubt, to criticize, to devi- 
ate, even to be wrong. There are peo- 
ple in this country, too, who have 
taken it upon themselves to tell 
scientists, psychologists and others, 
what they may teach and what they 
may discover. Soviet psychology 
should serve us as an object lesson. 
If we allow that kind of interference 
here, we might just as well shut up 
shop. 


OTTO KLINEBERG. 
Columbia University. 


JAQUES, ELLIOTT. The changing cul- 
ture of a factory. New York: Dry- 
den Press, 1952. Pp. xxi+341. 
$4.25. 


This is the published report of 
“... a case study of developments in 
the social life of one industrial com- 
munity between April, 1948 and 
November 1950.” The “case” is a 
small, publicly held British company 
engaged principally in the manu- 
facture, sale, and servicing of metal 
bearings. The study is concerned 
with the description, diagnosis, and 
treatment of the corporate syntality. 
The results reported are the product 
of the collaborative efforts of the per- 
sonnel of the company and of a thir- 
teen-member research team headed 
by the author of the book, Dr. 
Jaques. The research was sponsored 
by the Tavistock Institute of Human 
Relations, London, and has been ac- 
cepted as a Ph.D. thesis in the De- 
partment of Social Relations at 
Harvard. 

The first part provides a retro- 
spective glimpse of the corporate or- 
ganization as it evolved during the 
first fifty years of life. This is fol- 
lowed by the case study proper, a de- 
tailed description of events as they 
occurred during the period of ob- 
servation. The methods used by the 
research team to gain acceptance by 
company personnel at all levels and 
to function effectively in the multiple 
role of consultant, analyst, and thera- 
pist are a highlight of this second 
section of the report. The third and 
concluding part of the book contains 
an analysis and interpretation of the 
findings. “The method of analysis 
... will be to study how the pattern 
of social activity at Glacier (firm 
name) ... has come about through 
the interaction of the firm’s organi- 
zational structure, its customary way 
of doing things, and the behavior of 
its members ... we shall study the 
interaction of social structure, cul- 
ture, and personality.” The results 
presented in this part point up the 
need for defining and clarifying indi- 
vidual and group roles as an ante- 
cedent step both to understanding 
social behavior and to evaluating it. 
Inferentially, the adequacy of a 
group’s adjustment, in large measure, 
is considered to be a function of the 
members’ understanding of the au- 
thority-responsibility relationship and 
possession of authority by the indi- 
vidual members commensurate with 
their felt responsibilities. 

The study exemplifies social sci- 
ence at its best, transcending the 
boundaries of any single professional 
research area. From the standpoint 
of both content and methodology it 
warrants the attention of all psy- 
chologists concerned with interper- 
sonal and intergroup relations as they 
affect the individual’s adjustment. 
Generalization from the specific find- 
ings is, of course, limited by the very 
nature of the case study method and, 
in this instance, by the concurrent 
diagnosis and counseling required of 
the investigators during the course of 
the study. 

WILLIAM J. E. CRISSY. 

Queens College. 


JUDD, DEANE B. Color in business, 
science and industry. New York: 
Wiley, 1952. Pp. ix+401. $6.50. 


The author of Color in Business, 
Science and Industry seems to ad- 
dress himself primarily to business 
men and industrialists to call atten- 
tion to the scientific aids now avail- 
able for the solution of a variety of 
practical color problems. In the 
preface he says, “It has been my 
privilege ...in my twenty years at 
the National Bureau of Standards, 
to come into contact with hundreds 
of colorimetric sore spots in our in- 
dustrial life. I have seen victories 
that paid off in dollars and cents won 
by applying the sciences of mathe- 
matics, physics, and psychology to 
these problems.” The author refers 
specifically to a great number of 
practical color problems encountered 
in everyday life and endeavors to in- 
dicate how “visual psychophysics 
mixed with a liberal sprinkling of 
common sense” can provide a solu- 
tion for these problems. 


The work is divided into three 
principal sections: Part I, Basic 
Facts; Part II, Tools and Technics; 
and Part III, Physics and Psycho- 
physics of Colorant Layers. Cursory 
inspection reveals that Part II is the 
most important. More than half the 
book is devoted to this part, which is 
nearly three times as long as Part I 
and four times as long as Part III. 

Part I lays the groundwork for 
the later exposition. It includes a 
twenty-page treatment on the struc- 
ture and functions of the eye, a sum- 
mary of the basic physical, psycho- 
logical, and psychophysical terms 
currently employed in the field of 
color perception, a discussion of 
methods of color matching (a) by ad- 
dition of lights, (b) by rapid succes- 
sion of lights, and (c) by mixture of 
colorants. The first part closes with a 
discussion of different types of color 
deficiency and a brief description of 
the better known tests of color blind- 
ness. 

The various tools and technics used 
by the color specialist are described 
in Part II. The reader is informed in 
some detail, with the aid of many dia- 
grams and with the necessary quanti- 
tative tables, concerning spectro- 
photometers, the standard observer, 
chromaticity diagrams, tristimulus 
values and tristimulus colorimeters, 
subtractive colorimeters, photome- 
ters, photoelectric tristimulus colorim- 
eters, color standards, color scales, and 
color names. How each of these aids 
is to be applied in connection with 
diverse manufacturing problems is 
set forth in an easy running style with 
great clarity. Here the author has 
rendered important service to work- 
ers in the color practicum, not only 
by indicating what each of these aids 
is designed to do and how it is to be 
applied, but also by pointing out 
some of their limitations. 

Part III is devoted to special prob- 
lems, such as gloss, opacity or hiding 
power, clear and turbid media, which 
have been subjected to intensive 
quantitative analyses. The solutions 
of these problems, sometimes in- 
volving exponential and hyperbolic 
functions, will be of interest chiefly 
to those of high mathematical com- 
petence. 

About fifteen pages are devoted to 
each of the three final sections of the 
work: (a) an appendix, containing 
quantitative tables too extended to 
be included in the main body of the 
text, (b) a list of references, and (c) 
the index. Although the list of refer- 
ences contains more than three hun- 
dred and fifty items, a number of 
very important contributors to color 
are omitted. There is no reference to 
E. Hering, G. E. Müller, L. T. Tro- 
land, C. E. Ferree, and S. Hecht, to 
mention only a few of those no longer 
living. 

Of the several sciences sharing in 
the scientific study of color, physics 
and psychology fare somewhat better 
than physiology. However, the open- 
ing treatment in Part I is designed 
to provide some balance among the 
several sciences with overlapping in- 
terests in color. Despite the at- 
tempted integration among these 
sciences, it seems fair to say that 
Judd’s chief contribution consists in 
the description of the methods by 
which the physical correlates of the 
visual stimulus are to be specified. 
What seems to be left for other men 
of science, perhaps in electrophysiol- 
ogy, is the task of surveying more 
fully the possibilities of specifying 
and, in so far as possible, of rendering 
to quantitative terms the physio- 
logical correlates of the visual stimu- 
lus. 


MICHAEL J. ZIGLER. 
Wellesley College. 


KARN, HARRY W., AND GILMER, B. 
VON HALLER. Readings in indus- 
trial and business psychology. New 
York: McGraw-Hill, 1952. Pp. 
ix+476. $4.50. 


The trend toward books of read- 
ings in various areas of psychology is 
continued with the publication of this 
volume. Some such books have tend- 
ed to present materials from all eras 
of psychology, some have had the ori- 
ginal articles edited rather sharply, 
and some have included extensive 
comments by the editors directed 
toward emphasis and integration. 
The present volume does none of 
these. The editors state that they 
have not attempted integration nor 
have they tried to cover articles of 
historical interest. They have tried 
to include “representative” articles 
which are easily understood by those 
lacking extensive technical training. 

The aim of recency is realized, 41 
of the 53 articles having been pub- 
lished since 1944. The articles are 
drawn from 19 different sources with 
approximately 25 per cent of the 
articles having been published origi- 
nally in either the Journal of Applied 
Psychology or Personnel Psychology. 
The articles are rather evenly spread 
among 11 “fields” of business and 
industrial psychology. There is some 
question regarding “representative- 
ness” in the selected articles. For 
example, in terms of content, the 
editors might have selected a more 
representative article on sampling in 
market research than the one by 
Stanton written in 1941 which makes 
no reference to probability sampling. 
Furthermore, the number of articles 
devoted to each of the fields is not pro- 
portionate to the amount of research 
or information in those fields. Al- 
though this reviewer is not opposed 
to disproportionate sampling of ar- 
ticles, the term representative is 
hardly used in its usual sense and 
should be read as illustrative. It is 
good to see sections devoted to the 
place of the psychologist in industry 
and to his ethical problems. The 
comments made by the editors about 
each article are extremely brief and 
of no great value. These comments 
could have been expanded and might 
have included a few words about each 
author. 

As far as this reviewer knows, this 
is the first attempt to provide a book 
of general readings in business and 
industrial psychology since Moore 
and Hartmann’s Readings of 20 
years ago. The latter largely con- 
tained excerpts, whereas Karn and 
Gilmer present complete articles. 
Direct comparison of the books is 
probably unfair since the selections 
were not made on the same bases in 
the two books. However, a casual 
comparison shows that the present 
volume devotes more space to train- 
ing, counseling, job evaluation, mar- 
ket research, fatigue and efficiency, 
and leadership, and much less to 
tests and selection than did the ear- 
lier book. About the same amount of 
space is devoted to motivation and 
morale, and industrial relations. 

As the editors point out, this book 
should be a supplement to a system- 
atic text or else the teacher must or- 
ganize and integrate the subject mat- 
ter. The main advantage to this 
book is that a number of recent, il- 
lustrative articles are combined un- 
der one cover. 

LESTER GUEST. 

The Pennsylvania State College. 


YOUNG, KIMBALL. Personality and 
problems of adjustment. (2nd Ed.) 
New York: Appleton-Century- 
Crofts, 1952. Pp. x+716. $5.00. 


A considerable amount has been 
added to the first edition of this very 
readable book, which appeared first 
in 1940 (see Psychological Bulletin, 
1941, 188ff.). The author has in- 
cluded some fairly recent material 
from the perception-personality area, 
but the presentation is substantially 
eclectic. Every sort of approach has 
been covered, including George 
Mead’s subjective analysis of the “I” 
and “Me,” anthropological and psy- 
choanalytical and field theories, and 
the personalistic contribution of Gor- 
don Allport, leading into case studies. 
Some might object to Young’s dis- 
inclination to take a point of view 
and stick to it. However, the average 
student will probably be benefited 
by the broad coverage. 

The author is quite skeptical of our 
progress in the strictly experimental 
study of personality measurement. 
He states, “there are a large number 
of theories (of personality) but un- 
fortunately they are not, for the most 
part, so stated as to furnish a bridge 
to empirical testing either in the 
laboratory or by other scientific de- 
vices.” 

Part I is mainly concerned with 
theories about personality and its 
development. There are traditional 
presentations of language and other 
forms of learning. Little has been 
added here, and the chapters on 
symbolic behavior and the self have 
apparently not been altered much. 
Both of these areas remain useful 
summaries of often neglected topics. 

Part II, concerned with problems 
of adjustment, is the most concrete 
and will doubtless be read with major 
interest. As in the first edition, 
there are chapters devoted to in- 
fancy, childhood, adolescence, the 
college student, marriage, and neu- 
roticism. There is in addition a com- 
pletely new chapter on later maturity 
and old age, indicative of this rapidly 
expanding field. This reviewer feels 
that the chapter on psychological 
problems associated with occupation 
is given less up-to-date attention than 
it deserves, in comparison, for ex- 
ample, with the far greater considera- 
tion given to sexual and marital ad- 
justment problems. 

EDWARD S. JONES. 

University of Buffalo. 


THOMPSON, GEORGE G. Child psy- 
chology: growth trends in psycho- 
logical adjustment. Boston: Hough- 
ton Mifflin, 1952. Pp. xxxiv+667. 
$5.50. 


Oftentimes the child psychologist 
finds it difficult to convince him- 
self or his colleagues that his area of 
interest has any specific contribu- 
tion to make to psychology as a 
whole. In this book Thompson pre- 
sents a convincing demonstration 
that when it is interpreted as the psy- 
chology of development (rather than 
as the psychology of children, to be 
contrasted with the psychology of 
apes or adolescents or old people) 
child psychology is a scientific dis- 
cipline in its own right. This is a 
book developed thoughtfully, care- 
fully, and with a high degree of 
scholarship. Obviously familiar in 
detail with the vast literature in de- 
velopmental psychology, the author 
has been guided in his selection of 
material by criteria of scientific valid- 
ity and pertinence of the material to 
the point he is trying to make. In 
other words, while the book is com- 
prehensive it is not padded. Each 
of its fourteen chapters is organized 
as a unit and these units contribute 
to a functional whole. 

Thompson’s objective seems to 
have been to present an integrated 
picture of the developmental process 
underlying human behavior. Where 
necessary to round out the picture 
he has felt free to use findings with 
infrahuman “children,” studies of 
individual cases, concepts from per- 
sonality theory and the like, pointing 
out frequently that in many impor- 
tant areas of child psychology our 
knowledge is all too sparse. But per- 
vading the whole is the insistence 
on the application of scientific prin- 
ciples in evaluating data. Consider- 
ations such as numbers of cases, the 
design of studies, and the reliability 
of observation are given repeated and 
consistent emphasis. 

At various points in the book it 
becomes apparent that behavioral de- 
velopment does not always take 
place in a desirable direction, and at 
these points the author is particularly 
interested in examining any evi- 
dence which might explain such 
trends. This leads to careful con- 
sideration of methods of child rearing 
and guidance and education, in our 
own culture and in others. 

In his general orientation, the 
author's position is not far from that 
of the modern behaviorist, but he 
has given considerable effort to the 
task of presenting fairly such mate- 
rials as projective techniques and the 
theory underlying their use, if with 
an eclectic sort of damning with faint 
praise. Placing a premium upon 
factors of validity and reliability, it is 
natural that he would say of these 
techniques, “we should be cautious 
about our interpretations until more 
objective methods of scoring and 
interpretation have been worked out, 
and until more validation studies 
have been conducted with positive 
results” (p. 620). Exercising perhaps 
the same requirement of objective 
proof, the author gives almost no 
space at all to therapy as such. 

This book will be well received by 
those who feel that child psychology 
should be taught as a science. It is 
written in a fashion sufficiently inter- 
esting to be appropriate for the non- 
psychology major who wants to learn 
something about children before gen- 
erating his own, perhaps, and in a 
fashion sufficiently scholarly and 
comprehensive to satisfy the gradu- 
ate student looking for a standard 
reference in this field. This book is to 
be highly recommended. 


T. W. RICHARDS. 
Louisiana State University. 